ABSTRACT


ON DIFFERENTIALS, ASYMPTOTIC NORMALITY AND
ALMOST SURE BEHAVIOR OF STATISTICAL FUNCTIONS,
WITH APPLICATION TO M-STATISTICS FOR LOCATION
PARAMETERS


Parameters  of  interest  in  statistics  can  often  be  expressed  as  functionals 
T(F)  of  the  underlying  population  distribution  function,  in  which  case  a natural 
sample analogue estimator is provided by the "statistical function" $T(F_n)$ based
upon the sample distribution function $F_n$.

Several notions of differentiability of functionals T are formulated, including
innovations designed to broaden the scope of statistical application.
Methodology  for  finding  the  differential,  and  for  utilizing  it  to  characterize 
the  asymptotic  distribution  and  almost  sure  behavior  of  statistical  functions, 
is  presented.  Typically  this  means  asymptotic  normality  and  the  law  of  the 
iterated  logarithm.  Previous  work  of  von  Mises  (1947),  Kallianpur  and  Rao 
(1955),  Filippova  (1962),  Gregory  (1976)  and  Beran  (1977)  is  relevant. 

Application to M-estimates for location parameters is carried out. The
solution of the equation $\int \psi(x - T(F))\,dF(x) = 0$ is formulated as an M-functional
and conditions for its differentiability are investigated. Asymptotic normality
and the law of the iterated logarithm for $T(F_n)$ are established under
regularity conditions on $\psi$ slightly stronger than continuity and under minimal
restrictions on $F$. One-step estimators and the case of scale unknown are also
treated. Previous work of Huber (1964), Hampel (1974), Carroll (1975, 1977),
Collins (1976), Portnoy (1977) and Beran (1977) is augmented.
1. Introduction. Consider a functional $T(\cdot)$ defined on distribution
functions $G$, for example the variance functional $T(G) = \int [x - \int x\,dG(x)]^2\,dG(x)$.
Corresponding to a sample $X_1, \ldots, X_n$ from a distribution $F$, let $F_n$ denote
the sample distribution function (see (2.16)). The "statistical function"
$T(F_n)$ is the natural sample analogue estimator of the "parameter" $T(F)$.
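
To make the plug-in idea concrete, the following minimal sketch evaluates the variance functional first at an arbitrary discrete distribution and then at the sample distribution function, which places mass $1/n$ on each observation. The function names `variance_functional` and `statistical_function` and the data are illustrative assumptions, not notation from the text.

```python
def variance_functional(points, weights):
    """T(G) = integral of (x - mean_G)^2 dG(x) for a discrete G.

    `points` are the atoms of G and `weights` their probabilities.
    """
    mean = sum(w * x for x, w in zip(points, weights))
    return sum(w * (x - mean) ** 2 for x, w in zip(points, weights))


def statistical_function(sample):
    """T(F_n): evaluate T at the sample d.f., which puts mass 1/n on each X_i."""
    n = len(sample)
    return variance_functional(sample, [1.0 / n] * n)


# Hypothetical data, chosen only for illustration.
sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
t_fn = statistical_function(sample)
print(t_fn)
```

Under this reading, $T(F_n)$ is the sample variance with divisor $n$ rather than $n - 1$.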

The first part of this paper treats the differentiability of functionals
$T$ and shows how it forms the basis of a methodology for obtaining the convergence
theory of statistical functions $T(F_n)$. Specifically, it is seen how
to characterize the asymptotic distribution and almost sure behavior of
$T(F_n)$. Typically, this means asymptotic normality and the law of the
iterated logarithm:

(1.1)  $n^{1/2}\,[T(F_n) - T(F) - \mu(T, F)] \xrightarrow{d} N(0, \sigma^2(T, F)), \quad n \to \infty;$

(1.2)  $\limsup_{n \to \infty} \dfrac{n^{1/2}\,[T(F_n) - T(F) - \mu(T, F)]}{[2\,\sigma^2(T, F) \log\log n]^{1/2}} = 1 \quad \text{w.p.1.}$


Of course, when $\mu(T, F) = 0$, (1.2) implies strong consistency of $T(F_n)$. Here,
convergence in distribution is denoted by "$\xrightarrow{d}$," "with probability 1" is denoted
by "w.p.1," and $N(m, v)$ denotes a random variable having the normal distribution
with mean $m$ and variance $v$. The quantity $\mu(T, F)$ represents an asymptotic bias
parameter (usually 0) and the quantity $\sigma^2(T, F)$ an asymptotic variance parameter.

The functional representation of statistical parameters was first studied
in detail by von Mises (1947), who developed a theory of differentiation of
statistical functions $T(F_n)$ and employed corresponding Taylor expansions as
a tool for investigation of $T(F_n)$. This work was extended in the framework
of stochastic process theory by Filippova (1962). A recasting of von Mises'
approach in the context of Frechet differentiation was introduced by Kallianpur
and Rao (1955). This approach bypasses the troublesome remainder term in the
Taylor expansion but introduces a new technical difficulty, the handling of a
norm. In this vein, further development in the modern analysis context of
Frechet differentiation in Banach spaces has been provided by Gregory (1976).
All of these authors implement the approach to characterize the asymptotic
distribution theory of $T(F_n)$ for selected types of functional $T(\cdot)$, but do
not consider the question of the almost sure behavior of $T(F_n)$.

In  the  present  development,  attention  is  focused  upon  the  differential 
of  a functional  T as  the  key  concept  and  tool.  Although  essentially  in  the 
spirit  of  Frechet  differentiation,  our  orientation  differs  slightly  from  the 
modern analysis treatments, in that we conveniently avoid the requirement that
the  domain  of  T be  a linear  space.  Furthermore,  we  enrich  the  notion  of 
differential  in  two  ways  designed  to  broaden  the  scope  of  its  statistical 
application.  One  modification,  called  the  quasi-differential,  permits  a 
slight  degree  of  nonlinearity  in  the  form  of  the  differential.  The  other 
modification  consists  of  stochastic  versions  of  the  (quasi-)  differential. 
Indeed, the concept of stochastic quasi-differential is sufficiently flexible
that we are able to formulate it even for a sequence of statistics $\{T_n\}$ not
associated with a functional $T(\cdot)$.

In Section 2 we present these various notions of differential and provide
methodology for finding the differential and establishing its validity as
such. General discussion of the statistical role of the differential is
provided, and some specific statistical applications are developed. Lemmas
2.5 and 2.7 characterize the role of the (stochastic quasi-) differential in
reduction of the problem of convergence of $T(F_n) - T(F)$ to a similar problem
for a random variable of standard form, i.e., a sum $\sum g(X_i)$. These results
are given without restriction on the dependence of the $X_i$'s. As particular
applications of the lemmas, for the case of independent $X_i$'s, Theorems 2.1
and 2.2 provide conclusions of the forms (1.1) and (1.2). Here there are no
restrictions imposed on the underlying distribution $F$, other than what is
implicitly required in order that $T(F)$ be well-defined and that the
differential of $T(\cdot)$ be defined. Thus the theorems are applicable in
connection with a wide range of functionals $T(\cdot)$ and distributions $F$. As
an illustration of the methodology, the sample variance statistic is considered.
Concluding Section 2, some complementary discussion and details
are provided.

For implementation of the differential approach, a norm plays an intermediate
yet fundamental role. In specializing our methodology to obtain the
aforementioned Theorems 2.1 and 2.2, we utilize the "sup norm" and specifically
make heavy use of the random variable

(1.3)  $D_n = \sup_{-\infty < x < \infty} |F_n(x) - F(x)|,$

i.e., the "Kolmogorov-Smirnov statistic," for which considerable probabilistic
theory is available. In extending Theorems 2.1 and 2.2 to the case of
dependent $X_i$'s, it is necessary to deal only with sums of the form $\sum g(X_i)$
and with the statistic $D_n$. Thus such potential extension is straightforward
and broad.
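
As a computational aside, the statistic $D_n$ of (1.3) can be evaluated exactly from the order statistics when $F$ is continuous, since the supremum is then attained at a jump of $F_n$ (either at the jump or in the left limit just before it). The sketch below assumes a continuous $F$; the helper names `ks_statistic` and `uniform_cdf` and the three data points are hypothetical.

```python
def ks_statistic(sample, cdf):
    """D_n = sup_x |F_n(x) - F(x)| for a continuous d.f. `cdf`.

    At the i-th order statistic (0-based index i) the sample d.f. jumps
    from i/n to (i+1)/n, so both one-sided gaps are checked there.
    """
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        fx = cdf(x)
        d = max(d, (i + 1) / n - fx, fx - i / n)
    return d


def uniform_cdf(x):
    """Uniform(0, 1) distribution function, used here only as an example F."""
    return min(max(x, 0.0), 1.0)


d_n = ks_statistic([0.7, 0.1, 0.4], uniform_cdf)
print(d_n)
```

For these three points the largest gap occurs at the top order statistic, where $F_n$ jumps to 1 while $F(0.7) = 0.7$.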

Augmenting the methodology of Section 2, we present in Section 3 an
inequality which is useful in dealing with the sup norm.

The second part of this paper utilizes our differential approach to
establish new results for M-estimates of location parameters. These estimators
correspond to functionals $T$ defined as solutions of equations of the form

(1.4)  $\int \psi(x - T(F))\,dF(x) = 0.$

The "M-estimate" of $T(F)$ is then $T(F_n)$. Under very broad assumptions on $\psi$ and
$F$, we obtain asymptotic normality and the law of the iterated logarithm for
$T(F_n)$, augmenting previous work by Huber (1964), Hampel (1968), (1974), Carroll
(1975), (1977), Collins (1976), Portnoy (1977), and Beran (1977a, b). For
some choices of $\psi$ covered by earlier authors' theorems on asymptotic normality
of $T(F_n)$, we in addition characterize the almost sure behavior and/or relax
the assumptions on $F$. We also cover important choices of $\psi$ excluded by existing
theorems in the literature.

Section 4 presents this development, commencing with a discussion of
varieties of $\psi$ of interest: "Hubers," "Hampels," etc. In formulating a
precise notion of the solution of (1.4) as a functional $T(F)$, it is necessary
to take into account that the equation (1.4) may have multiple solutions for
some choices of $\psi$. We formulate a version we call the "M-functional" and
investigate conditions under which it possesses a stochastic quasi-differential.
Our extended notions of differential are advantageous here in reducing the
restrictions needed on $\psi$. Applying the differentiability thus established,
we present in Theorem 4.5 conclusions of form (1.1) and (1.2) for $T(F_n)$, in
the case of independent $X_i$'s. The regularity conditions on $\psi$ are slightly
stronger than continuity, and the regularity conditions on $F$ are minimal.
Extension to cases of dependent $X_i$'s would be straightforward. We also treat
one-step versions of M-estimators for location, with conclusions of form (1.1)
and (1.2) given by Theorem 4.8. A different type of extension, to the case
of scale unknown, is given by Theorem 4.10 and its corollary. Concluding
Section 4, we make comparisons with other results in the literature.

The discussion of the sample variance in Section 2 entails incidentally
the random variable in (1.3) and turns up the following seemingly open
question, of possible interest to probabilists. Under what normalization by
constants $\{a_n\}$ do we have

(1.5)  $\limsup_{n \to \infty} \dfrac{a_n}{D_n} = 1 \quad \text{w.p.1 ?}$

(In the reciprocal case, i.e., for $a_n D_n$, the answer is, of course, well
known; see Lemma 2.8.)

2.  Differentials  of  functionals  and  their  statistical  applications. 

2.0. Preliminary remarks. Let $T$ be a real-valued functional defined
on a set $\mathcal{F}$ of distribution functions. To avoid trivial complications, assume
that $\mathcal{F}$ contains for each $x$, $-\infty < x < \infty$, the distribution function $\delta_x$ degenerate
at the value $x$. Also, assume that $\mathcal{F}$ is convex: for each $G$ and $H$ in $\mathcal{F}$, the
"line segment" joining $G$ and $H$, i.e., the set of distribution functions

$(1 - \lambda)G + \lambda H = G + \lambda(H - G), \quad 0 \le \lambda \le 1,$

belongs to $\mathcal{F}$. Denote by $\mathcal{D}(\mathcal{F})$ the linear space generated by differences
$H - G$ of members of $\mathcal{F}$. Note that $\mathcal{D}(\mathcal{F})$ may be represented as
$\{\Delta : \Delta = c(G - F),\ \text{for } F \text{ and } G \text{ in } \mathcal{F} \text{ and } c \text{ real}\}$.
We shall consider $\mathcal{D}(\mathcal{F})$ to be equipped with a norm $\|\cdot\|$.

In  this  section  we  first  consider  a basic  notion  of  differential  for 
functionals  T defined  on  distribution  functions,  and  we  consider  methodology 
for  finding  the  form  of  the  differential  and  verifying  its  validity.  Next 
we  briefly  discuss  the  statistical  role  of  the  differential.  Based  on  this 
discussion,  we  then  introduce  certain  modifications  of  the  differential  designed 
to  broaden  the  scope  of  statistical  application.  The  notion  of  differential  is 
formulated  even  for  the  case  of  a sequence  of  statistics  not  generated  by  any 
functional  T.  Following  these  preparations,  specific  statistical  applications 
of  the  differential  approach  are  presented.  In  particular,  theorems  pertaining 
to  weak  and  strong  consistency,  asymptotic  normality,  and  the  law  of  the 
iterated  logarithm  for  statistical  functions  are  developed.  An  illustration 
for  the  sample  variance  statistic  is  carried  out.  Finally,  complementary 
discussion regarding possible norms and other aspects is provided, and related
work by other authors is cited.

2.1.  Basic  concept  of  differential. 

DEFINITION. We say that a functional $T$ defined on $\mathcal{F}$ has a differential
at the point $F \in \mathcal{F}$ with respect to the norm $\|\cdot\|$ and the set $\mathcal{G}_F \subset \mathcal{F}$ if there
exists a quantity $T(F; \Delta)$, defined on $\Delta \in \mathcal{D}(\mathcal{F})$, which is linear in the argument
$\Delta$ and satisfies the condition

(2.1)  $\lim_{\substack{\|G - F\| \to 0 \\ G \in \mathcal{G}_F}} \dfrac{T(G) - T(F) - T(F; G - F)}{\|G - F\|} = 0.$

($T(F; \Delta)$ is called the "differential.")

REMARKS. (i) By (2.1) is meant that for each $\varepsilon > 0$ there exists $\delta > 0$
such that $0 < \|G - F\| < \delta$, $G \in \mathcal{G}_F$, implies

(2.2)  $|T(G) - T(F) - T(F; G - F)| < \varepsilon\,\|G - F\|.$

(ii) To establish (2.1) or (2.2), it suffices (see Apostol (1957), p. 65)
to verify that it holds for all sequences $\{G_n\}$ in $\mathcal{G}_F$ satisfying $\|G_n - F\| \to 0$,
$n \to \infty$.

(iii) By linearity of $T(F; \Delta)$ is meant that

(2.3)  $T\bigl(F; \sum_{i=1}^{k} a_i \Delta_i\bigr) = \sum_{i=1}^{k} a_i\,T(F; \Delta_i)$

for $\Delta_1, \ldots, \Delta_k \in \mathcal{D}(\mathcal{F})$ and real $a_1, \ldots, a_k$.

(iv) In the general context of differentiation in Banach spaces, the
differential $T(F; \Delta)$ would be called the Frechet derivative of $T$. In such
treatments, the space $\mathcal{F}$ on which $T$ is defined is assumed to be a normed linear
space. We intentionally avoid this assumption here, in order to avoid defining
the functional $T$ at points $F$ which are not distribution functions. □

In typical cases, the quantity $T(F; G - F)$ may be characterized as a
directional derivative. The right-hand directional derivative of the functional
$T$ at the point $F$ in the direction of $G$ is defined as

(2.4)  $D_G T(F) = \lim_{t \to 0+} \dfrac{T(F + t(G - F)) - T(F)}{t},$

provided that this limit exists. Note that $D_G T(F)$ is just the usual right-hand
derivative, at $t = 0$, of the function

$Q(t) = T(F + t(G - F))$

defined as a function of a real variable $t$, $0 \le t \le 1$, for fixed distributions
$F$ and $G$. That is, $D_G T(F) = Q'(0)$. We now show that if the set $\mathcal{G}_F$ is sufficiently
rich, then the differential $T(F; G - F)$ is given by $D_G T(F)$.

DEFINITION. A set $\mathcal{G}$ of distributions is weakly starshaped at $F$ with
respect to $\mathcal{F}$ if for each $G \in \mathcal{F}$, the distributions $F_\lambda = (1 - \lambda)F + \lambda G$ belong
to $\mathcal{G}$ for all sufficiently small $\lambda > 0$. □

LEMMA 2.1. Suppose that $T$ has a differential at $F$ with respect to $\|\cdot\|$
and $\mathcal{G}_F$. Suppose that $\mathcal{G}_F$ is weakly starshaped at $F$ with respect to $\mathcal{F}$. Then,
for each $G \in \mathcal{F}$, $D_G T(F)$ exists and satisfies

(2.5)  $D_G T(F) = T(F; G - F).$

PROOF. Note that $F_\lambda - F = \lambda(G - F)$, so that $\|F_\lambda - F\| = \lambda\,\|G - F\| \to 0$
as $\lambda \to 0+$. Utilizing (2.1), the convexity of $\mathcal{F}$, and the linearity of $T(F; \Delta)$,
and taking $\lambda > 0$ sufficiently small that $F_\lambda \in \mathcal{G}_F$, we have

$T(F_\lambda) - T(F) = T(F; F_\lambda - F) + o(\|F_\lambda - F\|), \quad \lambda \to 0+,$

$\phantom{T(F_\lambda) - T(F)} = \lambda\,[T(F; G - F) + o(1)], \quad \lambda \to 0+.$

Thus (2.5) follows. □
The significance of (2.5) is that it permits the quantity $T(F; G - F)$, which is
defined only with respect to a specific norm $\|\cdot\|$, to be obtained as a quantity
$D_G T(F)$ defined without reference to any norm. This aspect will be important
in a scheme, now to be introduced, for determination of the differential
$T(F; \Delta)$.

A special role in the present development, as well as later in the statistical
application of the differential, is played by the function

(2.6)  $T[F; x] = T(F; \delta_x - F) - \mu(T, F), \quad -\infty < x < \infty,$

where

(2.7)  $\mu(T, F) = \int_{-\infty}^{\infty} T(F; \delta_x - F)\,dF(x).$

Note that

(2.8)  $\int_{-\infty}^{\infty} T[F; x]\,dF(x) = 0.$

The function $T[F; \cdot]$ is utilized primarily in connection with the following important
property typically (but not always) satisfied by differentials.

CONDITION L.

(2.9)  $T(F; \Delta) = \int_{-\infty}^{\infty} T[F; x]\,d\Delta(x), \quad \Delta \in \mathcal{D}(\mathcal{F}).$  □

An important implication of Condition L, obtained by the substitution
$\Delta = \delta_x - F$ in (2.9), is

(2.10a)  $T[F; x] = T(F; \delta_x - F), \quad -\infty < x < \infty,$

or equivalently

(2.10b)  $\mu(T, F) = 0.$

(Besides containing (2.10), Condition L represents a strengthening of
the linearity property of the differential. Indeed, by the usual linearity
expressed by (2.3), it follows easily that

$T(F; G - F) = \int_{-\infty}^{\infty} T[F; x]\,d[G(x) - F(x)] + \mu(T, F)$

for the case of $G$ discrete with finite support. Also, by linearity again,
$T(F; G - H) = T(F; G - F) - T(F; H - F)$. Thus, under (2.10), we have (2.9)
for all $\Delta = c(G - H)$, where $c$ is constant and $G$ and $H$ are discrete with finite
support. Hence Condition L merely extends (2.9), in the presence of (2.10),
to general $G$ and $H$.)

Now note that when the result of Lemma 2.1 is applicable for $G$ given by
$\delta_x$, each $x$, then Condition L may be written in the form

(2.11)  $T(F; \Delta) = \int_{-\infty}^{\infty} D_{\delta_x} T(F)\,d\Delta(x), \quad \Delta \in \mathcal{D}(\mathcal{F}).$

This suggests a convenient scheme for determination of the differential.
First obtain, by routine calculus methods of differentiation,

(2.12)  $D_{\delta_x} T(F) = \lim_{t \to 0+} \dfrac{T(F + t(\delta_x - F)) - T(F)}{t}, \quad -\infty < x < \infty.$

Then, motivated by the anticipated validity of (2.11), adopt it as the definition
of $T(F; \Delta)$. Finally, establish that $T(F; \Delta)$ so defined fulfills the definition
of the differential. The difficult part is the latter step. The following
lemma characterizes the scheme.

LEMMA 2.2. Suppose that (2.1) holds for $T(F; \Delta)$ defined by (2.11).
Then $T(F; \Delta)$ is the differential of $T$ at $F$ with respect to $\|\cdot\|$ and $\mathcal{G}_F$,
and Condition L holds. Also,

(2.13)  $T[F; x] = D_{\delta_x} T(F) - \int_{-\infty}^{\infty} D_{\delta_x} T(F)\,dF(x), \quad -\infty < x < \infty.$

If, further, $\mathcal{G}_F$ is weakly starshaped at $F$ with respect to $\mathcal{F}$, then

(2.14a)  $\int_{-\infty}^{\infty} D_{\delta_x} T(F)\,dF(x) = 0,$

and hence

(2.14b)  $T[F; x] = D_{\delta_x} T(F), \quad -\infty < x < \infty.$

PROOF. Clearly $T(F; \Delta)$ defined by (2.11) is linear in the argument $\Delta$.
Thus fulfillment of (2.1) completes the requirements for $T(F; \Delta)$ to be the
differential of $T$. Putting $\Delta = \delta_x - F$ in (2.11), we obtain

(2.15)  $T(F; \delta_x - F) = D_{\delta_x} T(F) - \int_{-\infty}^{\infty} D_{\delta_x} T(F)\,dF(x), \quad -\infty < x < \infty,$

which implies (2.10) and hence (2.13). From (2.11) and (2.13), Condition L
follows. The final claim of the lemma is straightforward from (2.15) and
Lemma 2.1. □
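
The first step of the scheme, the limit (2.12), can be checked numerically. The sketch below does so for the variance functional at a discrete $F$, comparing a difference quotient with the closed form $(x - m)^2 - s^2$ (with $m$ and $s^2$ the mean and variance of $F$) that elementary calculus gives for this particular functional, and then verifies (2.14a). The function names, the three-point distribution and the step size `t` are assumptions made for illustration only.

```python
def variance_functional(points, weights):
    """T(G) = integral of (x - mean_G)^2 dG(x) for a discrete G."""
    mean = sum(w * x for x, w in zip(points, weights))
    return sum(w * (x - mean) ** 2 for x, w in zip(points, weights))


def derivative_toward_point_mass(points, weights, x, t=1e-6):
    """Difference quotient [T(F + t(delta_x - F)) - T(F)] / t, as in (2.12)."""
    mixed_points = list(points) + [x]
    mixed_weights = [(1.0 - t) * w for w in weights] + [t]
    return (variance_functional(mixed_points, mixed_weights)
            - variance_functional(points, weights)) / t


def closed_form(points, weights, x):
    """(x - mean)^2 - variance: the calculus answer for the variance functional."""
    mean = sum(w * p for p, w in zip(points, weights))
    return (x - mean) ** 2 - variance_functional(points, weights)


# Hypothetical three-point distribution F with mean 1 and variance 1.5.
points, weights = [0.0, 1.0, 3.0], [0.5, 0.25, 0.25]
numeric = derivative_toward_point_mass(points, weights, 4.0)
exact = closed_form(points, weights, 4.0)

# (2.14a): the derivative integrates to zero against F.
mean_of_derivative = sum(w * closed_form(points, weights, p)
                         for p, w in zip(points, weights))
print(numeric, exact, mean_of_derivative)
```

The difference quotient agrees with the closed form up to a term of order `t`, which is the behavior expected of a right-hand derivative at $t = 0$.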


2.2. The statistical role of the differential. Consider a sample of
(not necessarily independent) observations $X_1, \ldots, X_n$ from a distribution $F$.
Let us denote by $F_n$ the sample distribution function based on $X_1, \ldots, X_n$, i.e.,

(2.16)  $F_n = \dfrac{1}{n} \sum_{i=1}^{n} \delta_{X_i}.$

Many statistics of interest may be represented either exactly or approximately
as "statistical functions" $T(F_n)$ for some functional $T$. In the case that $T$
possesses a differential, the analysis of $T(F_n)$ relative to the associated
"parameter" $T(F)$ may be reduced via the differential to a problem of standard
form in terms of a sum of random variables. Specifically, under appropriate
assumptions entailing the convergence of $F_n$ to $F$ (see Lemmas 2.5 and 2.7),
the asymptotic behavior of $T(F_n) - T(F)$ corresponds closely to that of
$T(F; F_n - F)$. But, by linearity of the differential, the latter random
variable may be written as an average of certain special random variables, i.e.,

(2.17a)  $T(F; F_n - F) = \dfrac{1}{n} \sum_{i=1}^{n} T(F; \delta_{X_i} - F)$

(2.17b)  $\phantom{T(F; F_n - F)} = \dfrac{1}{n} \sum_{i=1}^{n} T[F; X_i] + \mu(T, F).$

From this representation, and through appeal to central limit theory and other
probabilistic results for sums of random variables, key results for $T(F_n)$
follow. The asymptotic distribution theory (normality) of $T(F_n) - T(F)$
will be given in Theorem 2.1. The almost sure behavior will be characterized
by a law of iterated logarithm given in Theorem 2.2.

As a further statistical application of the differential, the function
$T[F; x]$ may be regarded as an "influence curve." For, by (2.17), the error
of approximation involved in estimating $T(F)$ by $T(F_n)$ is given approximately by

$\dfrac{1}{n} \sum_{i=1}^{n} T[F; X_i] + \mu(T, F).$

Thus $T[F; X_i]$ measures the approximate "influence" of the observation $X_i$ toward
the error $T(F_n) - T(F)$. This notion is due to Hampel (1968), (1974), who
defines

$D_{\delta_x} T(F), \quad -\infty < x < \infty,$

as the influence curve of the estimator $T(F_n)$ for $T(F)$. (Recall that in the
presence of Condition L and the condition that $\mathcal{G}_F$ is weakly starshaped at $F$
with respect to $\mathcal{F}$, we have $T[F; x] = D_{\delta_x} T(F)$.) A number of characteristics
of this function are interpreted by Hampel as measures of robustness
properties of the estimator $T(F_n)$.

We may interpret $\mu(T, F)$ as an asymptotic bias quantity. In typical
applications, $\mu(T, F) = 0$.
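
The approximation of the error $T(F_n) - T(F)$ by the average of the influence values can be seen in a small numerical sketch. It assumes the variance functional, for which elementary calculus gives $T[F; x] = (x - m)^2 - s^2$ with $m$ and $s^2$ the mean and variance of $F$, and takes $F$ to be standard normal; the sample size, seed and function names are illustrative choices. For this functional the remainder after subtracting the linear term is exactly $-(\bar{X} - m)^2$.

```python
import random


def sample_variance_functional(sample):
    """T(F_n) for the variance functional (divisor n)."""
    n = len(sample)
    mean = sum(sample) / n
    return sum((x - mean) ** 2 for x in sample) / n


def influence(x, mu, sigma2):
    """T[F; x] for the variance functional: (x - mu)^2 - sigma^2."""
    return (x - mu) ** 2 - sigma2


rng = random.Random(0)          # fixed seed, for reproducibility only
mu, sigma2 = 0.0, 1.0           # F assumed standard normal
sample = [rng.gauss(mu, 1.0) for _ in range(200)]
n = len(sample)

error = sample_variance_functional(sample) - sigma2             # T(F_n) - T(F)
linear_term = sum(influence(x, mu, sigma2) for x in sample) / n  # average of T[F; X_i]
remainder = error - linear_term
xbar = sum(sample) / n
print(error, linear_term, remainder)
```

Since $\bar{X} - m$ is of order $n^{-1/2}$ in probability, the remainder is of order $n^{-1}$, smaller than the linear term, which is the sense in which the influence values account for the error.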

2.3. A quasi-differential. Before pursuing formally these statistical
results based on the "differential approach," we introduce a modification
of the method designed to broaden the scope
of its statistical application. From the foregoing discussion concerning
approximation of $T(F_n) - T(F)$ by $T(F; F_n - F)$, it is clear that we would be
equally well served by an approximation of the form

$T_F(F_n) \cdot T(F; F_n - F),$

where $T_F(\cdot)$ is an auxiliary functional such that $T_F(F_n)$ converges to $T_F(F)$
in an appropriate stochastic sense. This motivates us to allow $T(F; G - F)$
to be replaced, in condition (2.1) in the definition of the differential, by
a quantity of the more general form $T_F(G) \cdot T(F; G - F)$. Accordingly, we
introduce the following notion (not the same as the "quasi-differential" of
Dieudonne (1960), p. 151):

DEFINITION. We say that a functional $T$ defined on $\mathcal{F}$ has a quasi-differential
at the point $F \in \mathcal{F}$ with respect to the norm $\|\cdot\|$, the set
$\mathcal{G}_F \subset \mathcal{F}$, and the functional $T_F(\cdot)$ defined on $\mathcal{G}_F$, if there exists a quantity
$T(F; \Delta)$, defined on $\Delta \in \mathcal{D}(\mathcal{F})$, which is linear in the argument $\Delta$ and satisfies
the condition

(2.18a)  $\lim_{\substack{\|G - F\| \to 0 \\ G \in \mathcal{G}_F}} \dfrac{T(G) - T(F) - T_F(G)\,T(F; G - F)}{\|G - F\|} = 0.$

Without loss of generality we also assume

(2.18b)  $T_F(F) = 1.$  □

($T(F; \Delta)$ is called the "quasi-differential.")

In the case that $T_F(G) \equiv 1$, the quasi-differential is in fact the differential.
In other cases, the quasi-differential can also in fact be a differential, but
the stronger version may be more difficult to verify or may entail additional
restrictions on the structure of $T(\cdot)$. In such cases, we have the option of
following the easier route and nevertheless achieving a satisfactory approximation
to $T(F_n) - T(F)$.


For  the  purpose  of  determining  the  quasi-differential,  the  scheme  based  on 
(2.11)  for  finding  the  differential  is  again  applicable.  The  following  analogue 
of  Lemma  2.2  is  easily  checked. 

LEMMA 2.3. Suppose that (2.18) holds for $T(F; \Delta)$ defined by (2.11).
Then $T(F; \Delta)$ is the quasi-differential of $T$ at $F$ with respect to $\|\cdot\|$, $\mathcal{G}_F$
and $T_F(\cdot)$, and Condition L holds. Also, (2.13) holds. If, further, $\mathcal{G}_F$ is
weakly starshaped at $F$ with respect to $\mathcal{F}$, then (2.14) holds.

2.4. Stochastic quasi-differentials. Although the preceding extension
of the concept of differential is adequate for many statistical applications,
further broadening is called for in some situations. We require merely that $T$
possess a quasi-differential in a suitable stochastic sense. Specifically,
(2.18a) is required to hold only for the sequence $\{F_n\}$. We thus
introduce

DEFINITION. We say that a functional $T$ defined on $\mathcal{F}$ has a weak stochastic
quasi-differential at the point $F \in \mathcal{F}$ with respect to the norm $\|\cdot\|$, the
sequence of random variables $\{X_i\}$, and the functional $T_F(\cdot)$, if $T_F(F_n)$ is
always defined and

(2.19)  $\|F_n - F\| \xrightarrow{p} 0, \quad n \to \infty,$

and there exists a quantity $T(F; \Delta)$, defined on $\Delta \in \mathcal{D}(\mathcal{F})$, which is linear in
the argument $\Delta$ and satisfies the condition

(2.20)  $\dfrac{T(F_n) - T(F) - T_F(F_n)\,T(F; F_n - F)}{\|F_n - F\|} \xrightarrow{p} 0, \quad n \to \infty.$

Without loss of generality we also assume (2.18b). □ ($T(F; \Delta)$ is called the
"stochastic quasi-differential.") Note that the observations $\{X_i\}$ are not
required to be independent.

DEFINITION. If the convergences in (2.19) and (2.20) hold with probability
1, then we say that $T$ has a strong stochastic quasi-differential. □
Note that the choice of norm $\|\cdot\|$ must serve two somewhat conflicting
purposes, corresponding to conditions (2.19) and (2.20). The first is more
easily satisfied for $\|\cdot\|$ "small," whereas the other is more easily satisfied
if $\|\cdot\|$ is "large."

We now examine the connections between the strict quasi-differential and
its stochastic versions. For this purpose, put

(2.21)  $L(G, F) = \begin{cases} \dfrac{T(G) - T(F) - T_F(G)\,T(F; G - F)}{\|G - F\|}, & \text{if } \|G - F\| > 0, \\ 0, & \text{if } \|G - F\| = 0, \end{cases}$

when the relevant quantities are defined.

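LEMMA 2.4. Suppose that $T$ has a quasi-differential at $F$ with respect to
$\|\cdot\|$, $\mathcal{G}_F$ and $T_F(\cdot)$.

(i) If

(2.22a)  $P\{F_n \in \mathcal{G}_F\} \to 1, \quad n \to \infty,$

and

(2.22b)  $\|F_n - F\| \xrightarrow{p} 0, \quad n \to \infty,$

then

(2.22c)  $L(F_n, F) \xrightarrow{p} 0, \quad n \to \infty.$

(ii) If

(2.23a)  $P(F_n \in \mathcal{G}_F, \text{ all } n \text{ sufficiently large}) = 1$

and

(2.23b)  $\|F_n - F\| \xrightarrow{\text{wp1}} 0, \quad n \to \infty,$

then

(2.23c)  $L(F_n, F) \xrightarrow{\text{wp1}} 0, \quad n \to \infty.$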


PROOF. (i) Let $\varepsilon > 0$ be given. By hypothesis, there exists $\delta > 0$ such
that $\|G - F\| < \delta$, $G \in \mathcal{G}_F$, implies $|L(G, F)| < \varepsilon$. Then

(2.24)  $P\{|L(F_n, F)| > \varepsilon\} \le P\{\|F_n - F\| \ge \delta\} + P\{F_n \notin \mathcal{G}_F\}.$

By (2.22a) and (2.22b), the right-hand side of (2.24) tends to 0 as $n \to \infty$.
Thus (2.22c) follows.

(ii) Trivial. □

Under the conditions of the lemma, $T$ has a weak stochastic quasi-differential
in case (i), a strong stochastic quasi-differential in case (ii).

The  scheme  previously  discussed  for  finding  the  form  of  the  quasi-differential 
serves  also  in  seeking  a stochastic  version. 

It  is  interesting  and  useful  that  the  concept  of  differential  may  also  be 
formulated  even  when  no  functional  is  explicitly  involved. 

DEFINITION. We say that a sequence of statistics $\{T_n\}$, $T_n = T_n(X_1, \ldots, X_n)$,
satisfying

(2.25a)  $T_n \xrightarrow{p} T_0, \quad n \to \infty,$

has a weak stochastic quasi-differential at $(T_0, F)$ with respect to the norm
$\|\cdot\|$, the sequence of random variables $\{X_i\}$ and the sequence of random variables
$\{Z_n\}$, if

(2.25b)  $\|F_n - F\| \xrightarrow{p} 0, \quad n \to \infty,$

and there exists a quantity $T(F; \Delta)$, defined on $\Delta \in \mathcal{D}(\mathcal{F})$, which is linear in
the argument $\Delta$ and satisfies the condition

(2.25c)  $\dfrac{T_n - T_0 - Z_n\,T(F; F_n - F)}{\|F_n - F\|} \xrightarrow{p} 0, \quad n \to \infty.$  □
DEFINITION. If the convergences in (2.25) hold with probability 1, then we
say that $\{T_n\}$ has a strong stochastic quasi-differential. □

In  the  remainder  of  this  section,  we  shall  confine  attention  to  the  versions 
based  on  functionals.  However,  in  Section  4 some  application  of  the  preceding 
generalization  will  be  noted. 

2.5. Statistical applications of stochastic quasi-differentials. The role
of the stochastic quasi-differential in obtaining asymptotic normality of
statistical functions is seen precisely from the following result, whose
proof is immediate.

LEMMA 2.5. Let $\{X_i\}$ be a sequence of observations (not necessarily independent)
on a distribution $F$. Let $T$ be a functional defined on a (convex) set $\mathcal{F}$
containing $F$. Suppose that $T$ has a weak stochastic quasi-differential at $F$ with
respect to $\|\cdot\|$, $\{X_i\}$ and $T_F(\cdot)$. Suppose further that

(2.26)  $n^{1/2}\,\|F_n - F\| = O_p(1).$

Then

(2.27)  $n^{1/2}\,|T(F_n) - T(F) - T_F(F_n)\,T(F; F_n - F)| \xrightarrow{p} 0, \quad n \to \infty.$

In particular, we now make application of Lemma 2.5 for the case of independent
observations and with respect to the norm

$\|h\|_\infty = \sup_{-\infty < x < \infty} |h(x)|.$

We shall utilize the following result.

LEMMA 2.6. Let $\{X_i\}$ be a sequence of independent observations on a
nondegenerate distribution $F$. Then

(2.28)  $n^{1/2}\,\|F_n - F\|_\infty \xrightarrow{d} Z_F, \quad n \to \infty.$


The proof of (2.28) for the case of $F$ continuous was first given by Kolmogorov
(1933). The distribution of $Z_F$ was given explicitly and seen not to depend upon
$F$. Extension to the case of $F$ having finitely many discontinuities and not
being purely atomic was obtained by Schmid (1958). Here also the distribution of
$Z_F$ was given explicitly; it depends upon $F$ in the case of discontinuities. The
general case is treated in Billingsley (1968), Section 16. By his Theorem 16.4
and subsequent discussion, it is seen that the random variable $Z_F$ may be
characterized as

(2.29)  $Z_F = \sup_{-\infty < x < \infty} |W^0(F(x))|,$

where $W^0$ is the "tied-down Wiener process" (or "Brownian bridge"), i.e., the
Gaussian stochastic process $\{W^0(t),\ 0 \le t \le 1\}$ specified by $E\{W^0(t)\} = 0$ and
$E\{W^0(s)\,W^0(t)\} = s(1 - t)$, $0 \le s \le t \le 1$. It is evident from (2.29) that $Z_F$ is
positive with probability 1, except in the case that $F$ is degenerate.

(For completeness we make explicit the line of development in Billingsley
(1968) leading to (2.28) and (2.29). We may represent the $X_i$'s as $X_i = F^{-1}(U_i)$,
where $\{U_i\}$ is a sequence of independent uniform random variables on $[0,1]$. Let
$G_n$ be the sample distribution function of $U_1, \ldots, U_n$. Then $F_n(x) = G_n(F(x))$,
$-\infty < x < \infty$, so that

$\|F_n - F\|_\infty = \sup_{t \in A} |G_n(t) - t|,$

where $A = \{F(x) : -\infty < x < \infty\}$. For the empirical stochastic process
$Y_n(t) = n^{1/2}[G_n(t) - t]$, $0 \le t \le 1$, Billingsley (Theorem 16.4) establishes that $Y_n$ converges
in distribution to $W^0$. This weak convergence is established in the space $D$ of
functions $x(\cdot)$ on $[0,1]$ that are right-continuous and have left-hand limits.
Consider the mapping $h(x) = \sup_{t \in A} |x(t)|$ of $D$ into the real line. We have

$n^{1/2}\,\|F_n - F\|_\infty = h(Y_n).$

Now, with respect to the Skorohod topology in $D$, the mapping $h$ is seen to be
continuous at points $x$ belonging to the space $C$ of continuous functions on $[0,1]$.
For, if $x_n$ converges to $x$ in the Skorohod topology, and $x \in C$, then $x_n$ converges
to $x$ in the uniform topology, and from this it follows that $h(x_n)$ converges to
$h(x)$. Thus $h$ is continuous with probability 1 under the measure corresponding
to $W^0$. Hence, by Billingsley's Theorems 5.1 and 16.4, $h(Y_n)$ converges in
distribution to $h(W^0) = Z_F$.)

Also, we shall be concerned, through the representation (2.17), with the
random variables $T[F; X_i]$, $1 \le i \le n$, and with the quantity $\mu(T, F)$. We may
express (2.7) in the form

(2.30)  y(T,  F)  = Ep{T(F;  - F)}. 

By (2.8), we have $E_F\{T[F; X]\} = 0$. Let us put

(2.31)  $\sigma^2(T, F) = \mathrm{Var}_F\{T[F; X]\}$.

THEOREM 2.1. Let $\{X_i\}$ be a sequence of independent observations on a
distribution F. Let T be a functional defined on a (convex) set $\mathcal{F}$ containing F.
Suppose that T has a weak stochastic quasi-differential at F with respect to
$\|\cdot\|_\infty$, $\{X_i\}$ and $T_F(\cdot)$. Assume that $0 < \sigma^2(T, F) < \infty$. Further, assume that

(i) if $\mu(T, F) = 0$, then

(2.32)  $T_F(F_n) \xrightarrow{p} 1$,  $n \to \infty$;

(ii) if $\mu(T, F) \ne 0$, then

(2.33)  $n^{1/2} [T_F(F_n) - 1] \xrightarrow{p} 0$,  $n \to \infty$.

Then

(2.34)  $n^{1/2} [T(F_n) - T(F) - \mu(T, F)] \xrightarrow{d} N(0, \sigma^2(T, F))$.
PROOF. By the independence assumption, Lemma 2.6 is applicable and thus (2.26)
holds for the norm $\|\cdot\|_\infty$. Thus, in turn, Lemma 2.5 is applicable and yields a
reduction to the random variable

(2.35)  $n^{1/2}\, T_F(F_n) [T(F; F_n - F) - \mu(T, F)] + \mu(T, F)\, n^{1/2} [T_F(F_n) - 1]$

in lieu of the random variable in (2.34). Utilizing the representation (2.17),
with either (2.32) or (2.33) as appropriate, we have by routine application of
central limit theory that the random variable in (2.35) converges in distribution
to $N(0, \sigma^2(T, F))$. □

In typical applications, the asymptotic bias quantity $\mu(T, F)$ is 0 and we
need merely to verify (2.32) instead of the stronger condition (2.33). Of
course, in many applications $T_F(G) \equiv 1$, in which case (2.32) and (2.33) hold
trivially.

Under a condition on $F_n$ slightly different from (2.26), but closely related
to the latter, we may characterize the almost sure behavior of $T(F_n)$. The role of
the stochastic quasi-differential is exhibited in the following result, whose
proof is immediate.

LEMMA 2.7. Let $\{X_i\}$ be a sequence of observations (not necessarily
independent) on a distribution F. Let T be a functional defined on a (convex) set
$\mathcal{F}$ containing F. Suppose that T has a strong stochastic quasi-differential at F
with respect to $\|\cdot\|$, $\{X_i\}$ and $T_F(\cdot)$. Suppose further that

(2.36)  $n^{1/2} \|F_n - F\| = O((\log\log n)^{1/2})$,  $n \to \infty$,  w.p. 1.

Then

(2.37)  $\frac{n^{1/2} [T(F_n) - T(F) - T_F(F_n)\, T(F; F_n - F)]}{(\log\log n)^{1/2}} \xrightarrow{wp1} 0$,  $n \to \infty$.

In particular, similar to our application of Lemma 2.5, we now consider the
case of independent observations and the norm $\|\cdot\|_\infty$. We shall utilize the
following result.

LEMMA 2.8. Let $\{X_i\}$ be a sequence of independent observations on a
distribution F. Then

(2.38)  $\limsup_{n \to \infty} \frac{n^{1/2} \|F_n - F\|_\infty}{(2 \log\log n)^{1/2}} = \sup_x \{F(x)[1 - F(x)]\}^{1/2}$  w.p. 1.

The proof of (2.38) in the case of F continuous was given by Chung (1949).
Extension to the case of F having discontinuities is due to Richter (1974).

We now establish a law of the iterated logarithm for statistical functions.

THEOREM 2.2. Let $\{X_i\}$ be a sequence of independent observations on a
distribution F. Let T be a functional defined on a (convex) set $\mathcal{F}$ containing F.
Suppose that T has a strong stochastic quasi-differential at F with respect to
$\|\cdot\|_\infty$, $\{X_i\}$ and $T_F(\cdot)$. Assume that $0 < \sigma^2(T, F) < \infty$. Further, assume that

(i) if $\mu(T, F) = 0$, then

(2.39)  $T_F(F_n) \xrightarrow{wp1} 1$,  $n \to \infty$;

(ii) if $\mu(T, F) \ne 0$, then

(2.40)  $\frac{n^{1/2} [T_F(F_n) - 1]}{(\log\log n)^{1/2}} \xrightarrow{wp1} 0$,  $n \to \infty$.

Then

$\limsup_{n \to \infty} \frac{n^{1/2} [T(F_n) - T(F) - \mu(T, F)]}{[2 \sigma^2(T, F) \log\log n]^{1/2}} = 1$  w.p. 1.

PROOF. By the independence assumption and Lemma 2.8, we have (2.36) for
the norm $\|\cdot\|_\infty$. It follows by Lemma 2.7 that


(2.41) 


lim 

n^ 


(2.42) 


•In 


- c 


'2n 


- 4 


3n 


wpl 


0,  n 
(2.46)  $T(F) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} h(x_1, x_2)\, dF(x_1)\, dF(x_2)$,

where $h(x_1, x_2) = \frac{1}{2}(x_1^2 + x_2^2 - 2 x_1 x_2)$. In this case, evaluation of $T(\cdot)$ at the
sample distribution function yields the "sample variance."


In order to explore the possibility of a differential for $T(\cdot)$, we begin
with (2.4) and readily obtain

(2.47)  $D_G T(F) = 2 \int \int h(x, y)\, dF(y)\, d[G(x) - F(x)]$

and thus

(2.48)  $D_{\delta_x} T(F) = 2 \left[ \int h(x, y)\, dF(y) - T(F) \right]$,

or, after some reduction,

(2.49)  $D_{\delta_x} T(F) = (x - \mu_F)^2 - \sigma_F^2$,

where $\mu_F$ and $\sigma_F^2$ denote the mean and variance of F. Thus, defining a "candidate"
differential for T in correspondence with (2.11), we have

$T(F; \Delta) = \int (x - \mu_F)^2\, d\Delta(x) - \sigma_F^2 \int d\Delta(x)$

and, in particular,

(2.50)  $T(F; G - F) = \int (x - \mu_F)^2\, dG(x) - \sigma_F^2 = \int x^2\, d[G(x) - F(x)] - 2 \mu_G \mu_F + 2 \mu_F^2$,

where $\mu_G$ denotes the mean of G. Also,

(2.51)  $T(G) - T(F) = \int x^2\, d[G(x) - F(x)] - (\mu_G^2 - \mu_F^2)$.

Hence

(2.52)  $T(G) - T(F) - T(F; G - F) = -(\mu_G - \mu_F)^2$.
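The algebra leading to (2.52) can be checked exactly on a small example. The sketch below represents two discrete distributions as lists of (value, probability) pairs with rational entries; the particular distributions F and G, and the helper names, are hypothetical choices made only for illustration.

```python
from fractions import Fraction as Fr

def mean(dist):
    """Mean of a discrete distribution given as (value, probability) pairs."""
    return sum(p * x for x, p in dist)

def variance(dist):
    """The variance functional T evaluated at a discrete distribution."""
    m = mean(dist)
    return sum(p * (x - m) ** 2 for x, p in dist)

def candidate_differential(F, G):
    """T(F; G - F) = int (x - mu_F)^2 dG(x) - sigma_F^2, as in (2.50)."""
    mu_F = mean(F)
    return sum(p * (x - mu_F) ** 2 for x, p in G) - variance(F)

# Hypothetical discrete distributions with exact rational arithmetic.
F = [(Fr(0), Fr(1, 2)), (Fr(2), Fr(1, 2))]
G = [(Fr(-1), Fr(1, 4)), (Fr(1), Fr(1, 4)), (Fr(5), Fr(1, 2))]

remainder = variance(G) - variance(F) - candidate_differential(F, G)
print(remainder, -(mean(G) - mean(F)) ** 2)  # the two sides of (2.52)
```

Because the arithmetic is rational, the two printed values agree exactly rather than up to rounding.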

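In order to establish that $T(F; \Delta)$ is a differential of T with respect to $\|\cdot\|_\infty$
and a set $\mathcal{G}_F \subset \mathcal{F}$, we need to show that

(2.53)  $\lim_{\|G - F\|_\infty \to 0,\ G \in \mathcal{G}_F} L(G, F) = 0$,

where

(2.54)  $L(G, F) = \frac{T(G) - T(F) - T(F; G - F)}{\|G - F\|_\infty} = \frac{-\left[ \int x\, d(G - F) \right]^2}{\|G - F\|_\infty}$.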

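Unfortunately, it is found by considering specific examples that in general
$L(G, F)$ need not $\to 0$ as $\|G - F\|_\infty \to 0$. However, we are able relatively easily
to establish a stochastic version of (2.53). Write

(2.55)  $L(F_n, F) = -\left[ n^{-1/2} \sum_{i=1}^n (X_i - \mu_F) \right] \cdot \left[ \frac{1}{n^{1/2} \|F_n - F\|_\infty} \right] \cdot \left[ n^{-1} \sum_{i=1}^n (X_i - \mu_F) \right]$.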

Assume that the $X_i$'s are I.I.D. By the central limit theorem, the first factor
in (2.55) converges in distribution to a finite-valued random variable. By Lemma
2.6 and the convergence (2.28),

(2.56)  $\frac{1}{n^{1/2} \|F_n - F\|_\infty} \xrightarrow{d} \frac{1}{Z_F}$,

since the function $g(x) = x^{-1}$ is continuous with probability 1 with respect to
the distribution of $Z_F$. Thus the second factor in (2.55) also converges in
distribution to a finite-valued random variable. Finally, by the strong law of
large numbers, the third factor in (2.55) converges to 0 with probability 1. It
follows that

(2.57)  $L(F_n, F) \xrightarrow{p} 0$,

i.e., that $T(F; \Delta)$ is a weak stochastic differential for $T(\cdot)$ at F with respect
to $\|\cdot\|_\infty$ and $\{X_i\}$. By an extended version of Lemma 2.2, we have that (2.13)
holds. In conjunction with (2.6) and (2.49), we thus have

$T[F; x] = (x - \mu_F)^2 - \sigma_F^2$,  $-\infty < x < \infty$,

and accordingly

$\mu(T, F) = 0$,

$\sigma^2(T, F) = \mathrm{Var}_F\{(X - \mu_F)^2\} = \mu_4(F) - \sigma_F^4$,

where $\mu_4(F) = E_F\{(X - \mu_F)^4\}$, the 4-th central moment of F. It thus follows from
Theorem 2.1 that

$n^{1/2} [T(F_n) - T(F)] \xrightarrow{d} N(0, \mu_4(F) - \sigma_F^4)$,

a well-known result expressing asymptotic normality of the sample variance.
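To make the quantities in this example concrete, the following sketch evaluates the plug-in versions of $T(F_n)$, of the asymptotic variance $\mu_4(F) - \sigma_F^4$, and of the values $T[F; x] = (x - \mu_F)^2 - \sigma_F^2$, with F replaced by the sample distribution function. The function name and the four-point data set are hypothetical; note the divisor n rather than n - 1, matching evaluation of the functional at $F_n$.

```python
def sample_variance_summary(xs):
    """Plug-in variance, plug-in asymptotic variance, and T[F_n; x_i] values."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n    # T(F_n), divisor n
    mu4 = sum((x - mean) ** 4 for x in xs) / n    # 4th central moment of F_n
    avar = mu4 - var ** 2                         # estimate of mu_4(F) - sigma_F^4
    influence = [(x - mean) ** 2 - var for x in xs]
    return var, avar, influence

var, avar, influence = sample_variance_summary([1.0, 2.0, 3.0, 4.0])
print(var, avar, influence)
```

Under the normal approximation above, an approximate standard error for $T(F_n)$ would be $(\text{avar}/n)^{1/2}$; the influence values average to zero by construction.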

If we now seek to establish a law of iterated logarithm for $T(F_n)$ via
Theorem 2.2, we need to establish that $L(F_n, F) \xrightarrow{wp1} 0$. In place of (2.55),
we write

(2.58)  $L(F_n, F) = -\left[ \frac{n^{-1/2} \sum_{i=1}^n (X_i - \mu_F)}{(\log\log n)^{1/2}} \right]^2 \cdot \left[ \frac{\log\log n}{n \|F_n - F\|_\infty} \right]$.

The first factor is O(1) with probability 1, by the classical law of the iterated
logarithm of Hartman and Wintner (1941). Thus it suffices to show that
We note that they are closely related to the U-statistics of Hoeffding (1948),
wherein asymptotic normality is established. The law of the iterated logarithm
for this class of statistic is noted in Serfling (1971).

2.7. Norms of interest. The norm

(2.61)  $\|h\|_\infty = \sup_{-\infty < x < \infty} |h(x)|$

receives special attention primarily because of the many useful results available
for the random variable $\|F_n - F\|_\infty$, as seen for example in Lemmas 2.6 and 2.8.
Noting, however, the conflicting demands imposed on any norm by the conditions
(2.19) and (2.20), we consider other possibilities also.
A useful alternative norm is given by

(2.62)  $\|h\|_q = \sup_{-\infty < x < \infty} \left| \frac{h(x)}{q(x)} \right|$,

where $q(\cdot)$ is a given bounded and continuous function, usually satisfying
restrictions on the rate of convergence to 0 as $x \to \pm\infty$. For use in the form
$\|F_n - F\|_q$, the choice of $q(\cdot)$ usually depends upon F. Effective utilization of this
norm has been made by Gregory (1976) and Boos (1977a). Indeed, versions of
Theorems 2.1 and 2.2 are available with $\|\cdot\|_\infty$ replaced by $\|\cdot\|_q$ for q in a
particular class.

Another norm of interest is the variation norm,

(2.63)  $\|h\|_V = \lim_{a \to -\infty,\ b \to \infty} V_{a,b}(h)$,

where

$V_{a,b}(h) = \sup \sum_{i=1}^{k} |h(x_i) - h(x_{i-1})|$,

the supremum being taken over all partitions $a = x_0 < x_1 < \cdots < x_k = b$ of the
interval [a, b]. Except in special cases, the norm $\|F_n - F\|_V$ is not effective for
our purposes. However, in Section 4, we shall utilize $\|h\|_V$ in a different way.

2.8. Concluding remarks. (i) Clearly, the concepts presented here
may be extended to the case of differentiation of higher order.

(ii) The "mechanical" aspect of the differential approach is worthy of note. For
example, in connection with the asymptotic normality result Theorem 2.1, the variance
parameter $\sigma^2(T, F)$ may be obtained mechanically without having yet established actual
existence of the differential.

(iii) The sequence of sample distribution functions $\{F_n\}$ possesses a unique
effectiveness among sequences converging weakly to F. This has been illustrated in
connection with the sample variance, for which a stochastic differential may be
established in the absence of a nonstochastic version. Similarly, in Section 4, a
stochastic quasi-differential will be established for a wider class of M-functionals
than in the nonstochastic case.

(iv) The "differential approach" was introduced into the statistical context by
von Mises (1947). Rather than employing a differential, he developed the Taylor
expansion of $T(F + t(F_n - F))$, considered as a function of the real variable t,
$0 \le t \le 1$, and treated the remainder term as asymptotically negligible in
appropriate senses. This also leads to (2.27). The extended treatment by Filippova (1962)
is in the same vein. However, the development by Kallianpur and Rao (1955) is
carried out in the setting of Frechet differentiation, with special reference to
the norm $\|\cdot\|_\infty$. Gregory (1976) provides a fuller development in this context,
with special reference to the norms of type $\|\cdot\|_q$, and applies his theory to
obtain asymptotic normality results for a wide class of linear functions of order
statistics, under the assumption of absolute continuity of F. By the use of the
broader versions of differential technique which we have developed in the present
section, a wider class of linear functions of order statistics may be treated,
the continuity of F may be relaxed, and almost sure behavior may be characterized
as well as asymptotic normality. See Boos (1977a,b) for full development. Finally,
we mention the recent work of Beran (1977a,b) developing and applying a theory of
Frechet differentiation of functionals with respect to the Hellinger metric.

(v) Extension of our treatment to the case of dependent observations on
F is straightforward. All that is needed is a handling of $T[F; X_i]$ and of
$\|F_n - F\|$ for such cases.

(vi) Extension to the multi-dimensional case is possible also. For example,
the condition that $n^{1/2} \|F_n - F\|_\infty = O_p(1)$, $n \to \infty$, needed for an extension of
Lemma 2.5, follows for F a distribution on $R^k$ by a result of Kiefer and Wolfowitz
(1958).

(vii) Throughout it is assumed implicitly that $T(F; F_n - F)$ is measurable.

(viii) Earlier we noted that condition L implies (2.10b), i.e., $\mu(T, F) = 0$.
This is also implied by the condition that F be discrete with finite support.


3. A useful inequality. The endeavor to establish a (stochastic quasi-)
differential often entails a quantity of the form

$\int h\, d(G - F)$,

where F and G are distribution functions. For example, refer to (2.54) in our
treatment of the variance functional. In some cases the following inequality
is quite useful. We shall exploit it in Section 4. (See Section 2.7 for the
definitions of the norms.)

LEMMA 3.1. Let the function h be continuous and of bounded variation on R.
Let G and F be distribution functions. Then

(3.1)  $\left| \int h\, d(G - F) \right| \le \|h\|_V\, \|G - F\|_\infty$.

The proof is based on the following two results. The first is easily proved
(or see Natanson (1961), p. 232). The second is given by Dunford and Schwartz
(1958), p. 154.

LEMMA 3.2. Suppose that f is bounded on R, g is of bounded variation on R,
and $\int f\, dg$ exists. Then

(3.2)  $\left| \int f\, dg \right| \le \|f\|_\infty \cdot \|g\|_V$.

LEMMA 3.3. Let f and g be of bounded variation on an interval (a, b),
allowing $a = -\infty$ and $b = +\infty$. Suppose that one of the functions is continuous in
(a, b) and that the other is continuous on the right. Then

(3.3)  $\int_a^b f(x)\, dg(x) + \int_a^b g(x)\, df(x) = f(b-)\, g(b-) - f(a+)\, g(a+)$.

PROOF OF LEMMA 3.1. By Lemma 3.3,

$\int h\, d(G - F) = -\int (G - F)\, dh + h(\infty)(G - F)(\infty) - h(-\infty)(G - F)(-\infty)$.

Since $|h(\pm\infty)| < \infty$, we have $h(\infty)(G - F)(\infty) = h(-\infty)(G - F)(-\infty) = 0$. By Lemma 3.2,
we obtain the desired result. □
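A small numerical illustration of the inequality of Lemma 3.1 may be helpful. In the sketch below, G and F are taken to be the sample distribution functions of two hypothetical data sets and h is taken to be tanh, which is continuous with total variation 2 on the real line; these choices, and the helper names, are assumptions made for illustration only.

```python
import math

def ecdf(sample):
    """Right-continuous sample distribution function of the given data."""
    xs = sorted(sample)
    n = len(xs)
    return lambda t: sum(1 for x in xs if x <= t) / n

def sup_norm_diff(x_sample, y_sample):
    """||G - F||_inf for the sample distribution functions of y and x."""
    F, G = ecdf(x_sample), ecdf(y_sample)
    # G - F is a right-continuous step function, constant between pooled
    # data points, so the supremum is attained at one of those points.
    return max(abs(G(t) - F(t)) for t in list(x_sample) + list(y_sample))

def integral_h_d_G_minus_F(h, x_sample, y_sample):
    """int h d(G - F) = mean of h over y minus mean of h over x."""
    return (sum(h(y) for y in y_sample) / len(y_sample)
            - sum(h(x) for x in x_sample) / len(x_sample))

x_sample = [0.0, 1.0]   # hypothetical data generating F
y_sample = [0.5, 3.0]   # hypothetical data generating G
lhs = abs(integral_h_d_G_minus_F(math.tanh, x_sample, y_sample))
rhs = 2.0 * sup_norm_diff(x_sample, y_sample)   # ||tanh||_V = 2
print(lhs, rhs)
```

The bound is far from tight here, which is typical: its value lies in separating a factor depending only on h from a factor depending only on G - F.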
4. Application to M-estimation of location parameters.

4.0. Preliminary remarks. The setting for "M-estimation" of a location
parameter is based on a sample $X_1, \ldots, X_n$ from a distribution F and a function $\psi$
such that the solution T of the equation

(4.1)  $\int \psi(x - T)\, dF(x) = 0$

coincides with the parameter in question. In this situation, a natural estimator
of the parameter is provided by the solution $T_n$ of the analogous equation based
on the sample distribution function $F_n$ in place of F, i.e., the equation

(4.2)  $\int \psi(x - T_n)\, dF_n(x) = n^{-1} \sum_{i=1}^n \psi(X_i - T_n) = 0$.

In this section we first examine some particular $\psi$ functions of interest in
such estimation problems. Then we give a precise formulation of the problem,
whereby the solution T of (4.1) becomes a well-defined functional T(F) defined
on distributions F, in which case the solution $T_n$ of (4.2) is just $T(F_n)$. The
formulation will take into account the scope of types of $\psi$ function of practical
interest. We then characterize the behavior of $T(F_n)$ via the methodology of
Section 2. As a preliminary step, the behavior of $T(G_n)$ for weakly convergent
sequences $\{G_n\}$ is explored. This yields results on the strong consistency of
$T(F_n)$. Then we establish that, under very broad assumptions regarding $\psi$ and F,
the functional T possesses a strong stochastic quasi-differential. Conditions
for a strict quasi-differential are also given. Using the differentiability
results, we obtain asymptotic normality and the law of the iterated logarithm
for $T(F_n)$. Following this development, we briefly consider some specific examples.
We then proceed to extend the results to one-step versions and to the case of
scale unknown. Finally, we make comparisons with related results in the literature.


4.1. Varieties of $\psi$ function. Different choices of $\psi$ lead to different
estimators. For example, the function $\psi(x) = x$ yields the sample mean, $\bar{X}$.
The function $\psi(x) = \mathrm{sgn}(x)$ yields the sample median.

Classical maximum likelihood estimation corresponds to the assumption that
F is of the form $F_0(x - \theta)$, where $\theta$ is the unknown location parameter and $F_0$
is a specified known distribution function. If $F_0$ has a differentiable density
$f_0$ and $\psi$ is given by

(4.3)  $\psi(x) = -\frac{f_0'(x)}{f_0(x)}$,  $-\infty < x < \infty$,

then $T_n$ is the maximum likelihood estimator of $\theta$. Of course, this choice of $\psi$
depends upon knowledge of $F_0$. In particular, if $F_0$ is the standard normal, then
$\psi$ is just $\psi(x) = x$.

Huber (1964) considered modification of this classical formulation in order
to obtain robust estimation of $\theta$, i.e., in order to obtain a choice of $\psi$ which
is optimal when $F_0$ is not completely known. In particular, for the case of $F_0$
assumed to belong to a specified neighborhood of the standard normal, he
introduced an associated minimax problem based on the criterion of asymptotic variance
of the estimator and obtained as optimal solution a $\psi$ function of the form

(4.4)  $\psi(x) = x$ for $|x| < k$;  $\psi(x) = k\, \mathrm{sgn}(x)$ for $|x| \ge k$,

where k is a parameter determined by the specifications of the minimax problem.
Note that this $\psi$ represents a truncation of the classical $\psi(x) = x$. In effect,
this robust estimator is less sensitive to extreme observations. A number of
other $\psi$ functions considered by Huber for the purposes of robust estimation are,
like (4.4), nondecreasing and bounded.

A further type of modification of the classical $\psi(x) = x$ was introduced
by Hampel (1968), (1974) for the purpose of further reducing the influence of
extreme observations on the error of estimation. His $\psi$ functions are not only
bounded but in addition "redescend" to the origin. A typical version is

(4.5)  $\psi(x) = -\psi(-x) = x$ for $0 \le x \le a$;  $= a$ for $a \le x \le b$;  $= a\,\frac{c - x}{c - b}$ for $b \le x \le c$;  $= 0$ for $x \ge c$.

Similar in character, but smoother, is the $\psi$ function

(4.6)  $\psi(x) = -\psi(-x) = \sin dx$ for $0 \le x \le \pi/d$;  $= 0$ for $x \ge \pi/d$.

Estimators based on (4.4), (4.5) and (4.6) were studied, along with many
other estimators, in the comprehensive Princeton Monte Carlo Study (Andrews
et al. (1972)) and were found empirically to be very robust in comparison with
various competitors. Furthermore, "redescenders" have been obtained as optimal
solutions in certain robustness problems (see Collins (1976) and Portnoy (1977)).
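For readers who wish to experiment, the three $\psi$ functions (4.4), (4.5) and (4.6) can be written down directly. The sketch below uses hypothetical function names, and the linearly descending segment of the Hampel-type function on [b, c] follows the usual three-part form, which is taken here as an assumption.

```python
import math

def huber_psi(x, k):
    """(4.4): identity on (-k, k), clipped to -k or k outside."""
    return max(-k, min(k, x))

def hampel_psi(x, a, b, c):
    """(4.5): odd, piecewise-linear and redescending, with 0 < a <= b < c.

    The descending segment a * (c - |x|) / (c - b) on [b, c] is the usual
    three-part form and is an assumption of this sketch.
    """
    sign = 1.0 if x >= 0 else -1.0
    u = abs(x)
    if u <= a:
        value = u
    elif u <= b:
        value = a
    elif u <= c:
        value = a * (c - u) / (c - b)
    else:
        value = 0.0
    return sign * value

def sine_psi(x, d):
    """(4.6): sin(d x) on [-pi/d, pi/d] and 0 outside."""
    return math.sin(d * x) if abs(x) <= math.pi / d else 0.0

for x in (-5.0, -3.0, 0.5, 1.5, 3.0, 5.0):
    print(x, huber_psi(x, 1.0), hampel_psi(x, 1.0, 2.0, 4.0), sine_psi(x, 1.0))
```

Only `huber_psi` is nondecreasing; the other two return to zero for large |x|, which is what allows the multiple-solution complications discussed in the next subsection.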

4.2. Definition of M-functional. In formulating a precise notion of the
solution of (4.1) as a functional T(F), it is necessary to take into account
that the equation (4.1) may have multiple solutions. In the case of a
redescending $\psi$, this complication indeed arises and may entail solutions outside
the range of the support of F. It is thus convenient to introduce values
$0 < p_1 < p_2 < 1$ such that solutions of (4.1) outside the interval $[F^{-1}(p_1), F^{-1}(p_2)]$
may be ignored, where as usual $F^{-1}(p) = \inf\{x:\ F(x) \ge p\}$. (In practice the
judicious selection of $p_1$ and $p_2$ is based on the need to satisfy certain
assumptions in the lemmas and theorems to follow.)

For a given function $\psi(x)$, $-\infty < x < \infty$, and a given distribution F, we define
the associated function

(4.7)  $\lambda_F(c) = \int_{-\infty}^{\infty} \psi(x - c)\, dF(x)$,  $-\infty < c < \infty$.

For given $0 < p_1 < p_2 < 1$, we define the set

$C(\psi; F; p_1, p_2) = \{c:\ \lambda_F(c) = 0 \text{ and } F^{-1}(p_1) \le c \le F^{-1}(p_2)\}$.

DEFINITION. The M-functional corresponding to $\psi$ and $(p_1, p_2)$ is defined as

(4.8)  $T(F) = \inf C(\psi; F; p_1, p_2)$, if $C(\psi; F; p_1, p_2)$ is nonempty,
       $T(F) = F^{-1}(\frac{1}{2}(p_1 + p_2))$, otherwise. □

Clearly, T(F) takes values only in the interval $[F^{-1}(p_1), F^{-1}(p_2)]$. A more
general definition, which we shall not need in the present development, would
substitute the condition "$\lambda_F(t)$ changes sign at t = c" for the condition
"$\lambda_F(c) = 0$." Further approaches toward definition of T(F) are discussed in
subsection 4.8 below.
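The definition (4.8) can be turned into a computation for a sample distribution function. The following is only a sketch under a simplifying assumption not made in the text: $\psi$ is taken to be continuous and nondecreasing, so that $\lambda_{F_n}(c)$ is continuous and nonincreasing in c and the infimum of its zero set can be located by bisection. The function names and the tolerance are hypothetical, and the returned value approximates $T(F_n)$ only to within that tolerance.

```python
import math

def empirical_quantile(xs_sorted, p):
    """F_n^{-1}(p) = inf{x: F_n(x) >= p} for 0 < p < 1."""
    n = len(xs_sorted)
    k = max(1, math.ceil(n * p - 1e-12))   # small guard against rounding in n * p
    return xs_sorted[k - 1]

def m_functional(sample, psi, p1, p2, tol=1e-10):
    """Approximate T(F_n) of (4.8), assuming psi continuous and nondecreasing."""
    xs = sorted(sample)
    n = len(xs)
    lam = lambda c: sum(psi(x - c) for x in xs) / n   # lambda_{F_n}(c), see (4.7)
    lo, hi = empirical_quantile(xs, p1), empirical_quantile(xs, p2)
    if lam(lo) == 0.0:
        return lo
    if lam(lo) < 0.0 or lam(hi) > 0.0:
        # lambda is nonincreasing, so it has no zero in [lo, hi]:
        # fall back to the second branch of (4.8).
        return empirical_quantile(xs, 0.5 * (p1 + p2))
    # Invariant: lam(lo) > 0 >= lam(hi); hi decreases to the smallest zero.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if lam(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return hi

huber = lambda x, k=1.5: max(-k, min(k, x))
print(m_functional([-2.0, -1.0, 0.0, 1.0, 2.0], huber, 0.1, 0.9))
```

For a redescending $\psi$ the monotonicity assumption fails and a different search over the interval would be needed; that case is deliberately left out of this sketch.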

4.3. Nonstochastic convergence aspects of sequences $\{T(G_n)\}$. For the
case of a sequence of distributions $\{G_n\}$ converging weakly to F, denoted
$G_n \Rightarrow F$, we provide conditions under which $T(G_n) \to T(F)$ as $n \to \infty$. In
particular, conditions on $\psi$ are specified.

Define, for $\varepsilon \ge 0$, the open interval

(4.9)  $I_F(\varepsilon) = (F^{-1}(p_1) - \varepsilon,\ F^{-1}(p_2 + \varepsilon) + \varepsilon)$

and denote the closure of $I_F(\varepsilon)$ by $\bar{I}_F(\varepsilon)$. Note that $\bar{I}_F(0)$ contains T(F). The
role of $I_F(\varepsilon)$ for $\varepsilon > 0$ is seen in the following result.

LEMMA 4.1. Suppose that $G_n \Rightarrow F$. Let $0 < p < 1$ and $\varepsilon > 0$ be such that
F is continuous at $F^{-1}(p) - \varepsilon$ and at $F^{-1}(p + \varepsilon) + \varepsilon$. (Such choices of $\varepsilon$,
arbitrarily small, may always be found.) Then, for all sufficiently large n,

(4.10)  $F^{-1}(p) - \varepsilon \le G_n^{-1}(p) \le F^{-1}(p + \varepsilon) + \varepsilon$.
PROOF. Suppose that the inequality $G_n^{-1}(p) < F^{-1}(p) - \varepsilon$ holds for infinitely
many n = 1, 2, .... Then the inequalities $p \le G_n(G_n^{-1}(p)) \le G_n(F^{-1}(p) - \varepsilon)$
hold for infinitely many n = 1, 2, .... But then the convergence $G_n \Rightarrow F$
yields $p \le F(F^{-1}(p) - \varepsilon)$ and, equivalently, $F^{-1}(p) \le F^{-1}(p) - \varepsilon$, a contradiction.
Thus the first inequality in (4.10) holds for all sufficiently large n. The
other inequality in (4.10) follows by a similar argument. □

COROLLARY. Suppose that $G_n \Rightarrow F$. Let $0 < p < 1$. If $F^{-1}(\cdot)$ is continuous
at p, then $G_n^{-1}(p) \to F^{-1}(p)$, $n \to \infty$. That is, $G_n^{-1} \Rightarrow F^{-1}$.

The preceding lemma shows that if $G_n \Rightarrow F$, then the interval $\bar{I}_{G_n}(0)$,
which contains $T(G_n)$, must for large n lie within the interval $I_F(\varepsilon)$, which
contains T(F). This is one step toward the convergence of $T(G_n)$ to T(F). The next
result focuses more directly upon this convergence. To this effect, we
introduce a condition which ensures that T(F) is well-defined as a "target" parameter.

CONDITION A. (i) The equation $\lambda_F(c) = 0$ has a unique solution T(F) in the
interval $[F^{-1}(p_1), F^{-1}(p_2)]$, and $\lambda_F(\cdot)$ changes sign at T(F); (ii) in fact, T(F)
lies in the slightly smaller interval $(F^{-1}(p_1 + \varepsilon_1), F^{-1}(p_2))$ for some $\varepsilon_1 > 0$;
(iii) moreover, T(F) is the unique zero of $\lambda_F(\cdot)$ in the slightly larger
interval $[F^{-1}(p_1) - \varepsilon_2,\ F^{-1}(p_2 + \varepsilon_2) + \varepsilon_2]$ for some $\varepsilon_2 > 0$. □

An alternative version of Condition A, which could be substituted for
Condition A in the sequel, is given by replacing (ii) by (ii*) $F^{-1}(p)$ is
continuous at $p_1$. Still another version is given by replacing (ii) and (iii),
respectively, by (ii') $F^{-1}(p)$ is continuous at $p_1$ and $p_2$; (iii') T(F) is the
unique zero of $\lambda_F(\cdot)$ in $[F^{-1}(p_1) - \varepsilon,\ F^{-1}(p_2) + \varepsilon]$ for some $\varepsilon > 0$.

LEMMA 4.2. Let F, $p_1$, $p_2$ and $\psi$ be such that Condition A holds. Let $\{G_n\}$
satisfy

(4.11)  $G_n \Rightarrow F$;

(4.12)  $\lambda_{G_n}(\cdot)$ converges continuously to $\lambda_F(\cdot)$; i.e., if $b_n \to b$,
        then $\lambda_{G_n}(b_n) \to \lambda_F(b)$, $n \to \infty$;

(4.13)  $\lambda_{G_n}(c)$ is continuous in c, each n.

Then

(4.14)  $\lim_{n \to \infty} T(G_n) = T(F)$.


PROOF. By A(ii), there exist $c_1$ and $c_2$ such that

$F^{-1}(p_1 + \varepsilon_1) < c_1 < T(F) < c_2 < F^{-1}(p_2)$.

Then, by A(i), $\lambda_F(c_1)$ and $\lambda_F(c_2)$ are opposite in sign. By (4.12) it follows that
$\lambda_{G_n}(c_1)$ and $\lambda_{G_n}(c_2)$ have opposite signs for all sufficiently large n. For such n,
it follows by (4.13) that the equation $\lambda_{G_n}(c) = 0$ has a solution in the interval
$(c_1, c_2)$. By (4.11) and Lemma 4.1, we have, for all sufficiently large n,

$(c_1, c_2) \subset (G_n^{-1}(p_1),\ G_n^{-1}(p_2)) \subset (F^{-1}(p_1) - \varepsilon_2,\ F^{-1}(p_2 + \varepsilon_2) + \varepsilon_2)$.

Thus there exists $N_0$ such that the values $\{T(G_n),\ n \ge N_0\}$ are solutions
of $\lambda_{G_n}(c) = 0$ and lie in the finite interval $I_F(\varepsilon_2)$. Thus there exists a point
$c_0$ in the closed interval $\bar{I}_F(\varepsilon_2)$ such that $T(G_{n_k}) \to c_0$ for some subsequence
$\{G_{n_k}\}$, $n_k \ge N_0$. By (4.12) again, we have $\lambda_F(c_0) = 0$. By A(iii), we have
$c_0 = T(F)$. We have thus established that every $\varepsilon$-neighborhood of T(F) must
contain $T(G_n)$ for all but a finite number of n. That is, (4.14) holds. □

The following result provides various simple conditions on $\psi$ sufficient
for one or the other of (4.12) and (4.13) to hold.

LEMMA 4.3. (i) If $\psi$ is continuous and bounded, then (4.13) holds. Also,
$\lambda_F(c)$ is continuous in c. If also (4.11) holds, then

(4.15)  $\lambda_{G_n}(b) \to \lambda_F(b)$,  $n \to \infty$,  each b.

(ii) If $\psi$ is continuous and nondecreasing, then (4.13) holds. Also, $\lambda_F(c)$
is continuous in c. If also (4.15) holds, then (4.12) holds. (iii) If $\psi$ is
uniformly continuous and (4.15) holds, then (4.12) holds.

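PROOF. (i) Apply the Dominated Convergence Theorem and the Helly-Bray
Theorem. (ii) The first two statements follow from the Monotone Convergence
Theorem. Now suppose that (4.15) holds. Let $b_n \to b$. Let $\delta > 0$ be given. Let
n be sufficiently large that $|b_m - b| < \delta$, all $m \ge n$. Then, since $\psi$ is
nondecreasing, we have

(1)  $|\lambda_{G_m}(b_m) - \lambda_{G_m}(b)| \le |\lambda_{G_m}(b + \delta) - \lambda_{G_m}(b)| + |\lambda_{G_m}(b - \delta) - \lambda_{G_m}(b)|$.

Now

$|\lambda_{G_m}(b + \delta) - \lambda_{G_m}(b)| \le |\lambda_F(b + \delta) - \lambda_F(b)| + |\lambda_{G_m}(b + \delta) - \lambda_F(b + \delta)| + |\lambda_{G_m}(b) - \lambda_F(b)|$.

Thus

$\sup_{m \ge n} |\lambda_{G_m}(b + \delta) - \lambda_{G_m}(b)| \le |\lambda_F(b + \delta) - \lambda_F(b)| + B(n, \delta)$,

where

$B(n, \delta) = \sup_{m \ge n} |\lambda_{G_m}(b + \delta) - \lambda_F(b + \delta)| + \sup_{m \ge n} |\lambda_{G_m}(b) - \lambda_F(b)|$.

By (4.15), $B(n, \delta) \to 0$ as $n \to \infty$. Hence

(2)  $\lim_{n \to \infty} \sup_{m \ge n} |\lambda_{G_m}(b + \delta) - \lambda_{G_m}(b)| \le |\lambda_F(b + \delta) - \lambda_F(b)|$.

Similarly,

(3)  $\lim_{n \to \infty} \sup_{m \ge n} |\lambda_{G_m}(b - \delta) - \lambda_{G_m}(b)| \le |\lambda_F(b - \delta) - \lambda_F(b)|$.

Combining (1), (2) and (3), we have

(4)  $\lim_{n \to \infty} \sup_{m \ge n} |\lambda_{G_m}(b_m) - \lambda_{G_m}(b)| \le |\lambda_F(b + \delta) - \lambda_F(b)| + |\lambda_F(b - \delta) - \lambda_F(b)|$.

Letting $\delta \to 0$ in the right-hand side of (4), and making use of the continuity
of $\lambda_F(c)$ established already, we obtain

(4.16)  $\lim_{n \to \infty} \sup_{m \ge n} |\lambda_{G_m}(b_m) - \lambda_{G_m}(b)| = 0$,

which, with (4.15), implies (4.12).

(iii) Write

$|\lambda_{G_m}(b_m) - \lambda_{G_m}(b)| = \left| \int [\psi(x - b_m) - \psi(x - b)]\, dG_m(x) \right| \le \|\psi(x - b_m) - \psi(x - b)\|_\infty$.

It follows that

(4.17)  $\lim_{m \to \infty} |\lambda_{G_m}(b_m) - \lambda_{G_m}(b)| \le \lim_{m \to \infty} \|\psi(x - b_m) - \psi(x - b)\|_\infty = 0$,

which, with (4.15), implies (4.12). □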

At this point it is convenient to introduce, for each $\varepsilon > 0$, the set of
distributions

$\mathcal{G}_F(\varepsilon) = \{G:\ \lambda_G(T(G)) = 0 \text{ and } T(G) \in I_F(\varepsilon)\}$,

which will play in the present context of M-functionals the role of the set $\mathcal{G}_F$
in the definition of the quasi-differential. The following result is parallel
to Lemma 4.2. Under the same conditions, except that (4.12) is relaxed to (4.15),
it provides the slightly weaker conclusion that, for every $\varepsilon > 0$, $T(G_n)$ lies in
$I_F(\varepsilon)$ for all sufficiently large n.

LEMMA 4.4. Let F, $p_1$, $p_2$ and $\psi$ be such that Condition A holds. Let $\{G_n\}$
satisfy (4.11), (4.13) and (4.15). Then, for every $\varepsilon > 0$,

(4.18)  $G_n \in \mathcal{G}_F(\varepsilon)$,  all n sufficiently large.

PROOF. It is readily seen that if Condition A holds, the constant $\varepsilon_2 > 0$ in
A(iii) may be taken arbitrarily small. Now the first part of the proof of Lemma 4.2
establishes that for all sufficiently large n, $\lambda_{G_n}(T(G_n)) = 0$ and $T(G_n) \in I_F(\varepsilon_2)$.
Thus we have (4.18). □

4.4. Strong consistency of $T(F_n)$. We are now prepared to assert important
convergence properties of $F_n$ and $T(F_n)$, analogous to (4.18) and (4.14).

LEMMA 4.5. Let F, $p_1$, $p_2$ and $\psi$ be such that Condition A holds. Suppose
that

(4.19)  $\psi$ is continuous;

(4.20)  $\|F_n - F\|_\infty \xrightarrow{wp1} 0$,  $n \to \infty$;

(4.21)  $\lambda_{F_n}(c) \xrightarrow{wp1} \lambda_F(c)$,  $n \to \infty$  (each c).

Then, for every $\varepsilon > 0$,

(4.22)  $P\{F_n \in \mathcal{G}_F(\varepsilon),\ \text{all n sufficiently large}\} = 1$.

If, further, either

(4.23a)  $\psi$ is nondecreasing

or

(4.23b)  $\psi$ is uniformly continuous,

then

(4.24)  $T(F_n) \xrightarrow{wp1} T(F)$,  $n \to \infty$.

PROOF. Writing

(4.25)  $\lambda_{F_n}(c) = n^{-1} \sum_{i=1}^n \psi(X_i - c)$,

we see that (4.19) implies (4.13), with $G_n$ replaced by $F_n$. By (4.20) and (4.21),
we have that (4.11) and (4.15), with $G_n$ replaced by $F_n$, hold with probability 1.
Thus, by Lemma 4.4, (4.22) is proved.

If either (4.23a) or (4.23b) holds, then by Lemma 4.3 (ii), (iii) we have
that (4.12), with $G_n$ replaced by $F_n$, holds with probability 1. Thus (4.24) is
proved. □

REMARKS. (i) A sufficient condition for (4.20) and (4.21) is that the
$X_i$'s be I.I.D. This follows from the Glivenko-Cantelli Theorem and the
classical Strong Law of Large Numbers.

(ii) An alternative sufficient condition for (4.21) is that (4.19) and
(4.20) hold and $\psi$ be bounded. This follows by the Helly-Bray Theorem. □

In view of Remark (i) above, we have by Lemma 4.5 the following result
characterizing strong consistency of $T(F_n)$ under typical conditions on $\psi$ and
the $X_i$'s.

THEOREM 4.1. Let F, $p_1$, $p_2$ and $\psi$ be such that Condition A holds. Suppose
that either

(4.26a)  $\psi$ is continuous and nondecreasing

or

(4.26b)  $\psi$ is uniformly continuous.

Let $\{X_i\}$ be a sequence of independent observations on F. Then (4.22) and (4.24)
hold.
4.5. A quasi-differential for $T(\cdot)$. We now investigate the existence
of a (possibly stochastic) quasi-differential for an M-functional. Defining

(4.27)  $H(s, t) = \int \psi(x - s)\, dF_t(x)$,

where

$F_t = F + t(G - F)$,

the equation (4.1) defining $T(F_t)$ may be written

(4.28)  $H(T(F_t), t) = 0$.

By implicit differentiation (with respect to t) in the equation (4.28), we obtain

$\left. \frac{\partial H}{\partial s} \right|_{s = T(F),\, t = 0} \cdot \left. \frac{dT(F_t)}{dt} \right|_{t = 0} + \left. \frac{\partial H}{\partial t} \right|_{s = T(F),\, t = 0} = 0$,

i.e.,

$\lambda_F'(T(F)) \cdot D_G T(F) + [\lambda_G(T(F)) - \lambda_F(T(F))] = 0$,

i.e., if $\lambda_F(T(F)) = 0$,

(4.29)  $D_G T(F) = \frac{\lambda_G(T(F))}{-\lambda_F'(T(F))}$

and, in particular,

(4.30)  $D_{\delta_x} T(F) = \frac{\psi(x - T(F))}{-\lambda_F'(T(F))}$,  $-\infty < x < \infty$,

corresponding to (2.12). We thus define

(4.31)  $T(F; \Delta) = \frac{\int \psi(x - T(F))\, d\Delta(x)}{-\lambda_F'(T(F))}$

and seek to establish that this is a (possibly merely stochastic) quasi-differential
for $T(\cdot)$. In particular, we deal with

(4.32)  $T(F; G - F) = \frac{\int \psi(x - T(F))\, d[G(x) - F(x)]}{-\lambda_F'(T(F))}$.


We next express $T(G) - T(F)$ in a form suitable for relating to $T(F; G - F)$. Define

$h(t) = \frac{\lambda_F(t) - \lambda_F(T(F))}{t - T(F)}$,  $t \ne T(F)$,

$h(t) = \lambda'_F(T(F))$,  $t = T(F)$.

Then we may write

(4.33)  $T(G) - T(F) = \frac{\int [\psi(x - T(G)) - \psi(x - T(F))] \, dF(x)}{h(T(G))}$.

Under the assumption that

(4.34)  $\lambda_F(T(F)) = 0$ and $\lambda_G(T(G)) = 0$,

which will be satisfied in the context to be considered, (4.33) takes the form

(4.35)  $T(G) - T(F) = \frac{\int \psi(x - T(G)) \, d[G(x) - F(x)]}{-h(T(G))}$.


Comparison of (4.32) and (4.35) reveals compatibility in the numerators but not in the denominators. However, noting that $h(T(G)) \to \lambda'_F(T(F))$ as $T(G) \to T(F)$, we see the utility of the quasi-differential approach. Define the functional

(4.36)  $T_F(G) = \frac{\lambda'_F(T(F))}{h(T(G))}$.

Then $T_F(F) = 1$ and (4.32) is equivalent to

(4.37)  $T_F(G) \, T(F; G - F) = \frac{\int \psi(x - T(F)) \, d[G(x) - F(x)]}{-h(T(G))}$,

a form somewhat more compatible with (4.35). Indeed, by the use of Lemma 3.1 with (4.35) and (4.37), we arrive at the inequality

(4.38)  $|T(G) - T(F) - T_F(G) \, T(F; G - F)| \le \frac{\|G - F\|_\infty \, \|\psi(x - T(F)) - \psi(x - T(G))\|_V}{|h(T(G))|}$.

(This is still subject to the proviso (4.34).)

The inequality (4.38), in conjunction with the convergence results developed in subsections 4.3 and 4.4, is very useful in characterizing $T(F; \Delta)$ as a quasi-differential or stochastic quasi-differential for T(·). We shall give results covering both versions.

THEOREM 4.2. Let $F$, $p_1$, $p_2$ and $\psi$ be such that Condition A holds. Assume that $\lambda'_F(T(F)) \ne 0$. Suppose that either

(4.39a)  $\psi$ is continuous, nondecreasing and bounded

or

(4.39b)  $\psi$ is uniformly continuous and bounded.

Suppose also that $\psi$ satisfies

(4.40)  $\lim_{b \to 0} \|\psi(x - b) - \psi(x)\|_V = 0$.

Then $T(F; \Delta)$ defined by (4.31) is a quasi-differential for T(·) at $F$ with respect to the norm $\|\cdot\|_\infty$, the set $\mathcal{G}_F(\epsilon_2)$ and the functional $T_F(\cdot)$ defined by (4.36).

PROOF. Let $\{G_n\}$ satisfy $\|G_n - F\|_\infty \to 0$, $n \to \infty$. Then (4.11) holds. Hence also, by Lemma 4.3 in conjunction with (4.39), conditions (4.12) and (4.13) hold. Therefore, by Lemmas 4.2 and 4.4, $G_n \in \mathcal{G}_F(\epsilon_2)$ for all $n$ sufficiently large and $T(G_n) \to T(F)$, $n \to \infty$. Thus

(4.41)  $h(T(G_n)) \to \lambda'_F(T(F)) \ne 0$

and (4.34) holds for the given $F$ and for $G_n$ for $n$ sufficiently large. Therefore, by (4.40) and (4.38), we obtain (2.18a). □

We now relax somewhat the requirements on $\psi$ and still obtain that $T(F; \Delta)$ is a quasi-differential in the strong stochastic sense. A proof similar to that of Theorem 4.2, making use of Theorem 4.1 in place of Lemmas 4.2 and 4.4, yields the following result.

THEOREM 4.3. Let $F$, $p_1$, $p_2$ and $\psi$ be such that Condition A holds. Assume that $\lambda'_F(T(F)) \ne 0$. Suppose that either

(4.41a)  $\psi$ is continuous and nondecreasing

or

(4.41b)  $\psi$ is uniformly continuous.

Suppose also that $\psi$ satisfies (4.40). Let $\{X_i\}$ be a sequence of independent observations on $F$. Then $T(F; \Delta)$ given by (4.31) is a strong stochastic quasi-differential for T(·) at $F$ with respect to the norm $\|\cdot\|_\infty$, the sequence $\{X_i\}$ and the functional $T_F(\cdot)$ given by (4.36).

REMARK. Under the conditions of the preceding theorems, the quantity $T(F; \Delta)$ given by (4.31) is a (stochastic) quasi-differential satisfying Condition L. For later reference, we note

(4.42)  $T[F; x] = -\frac{\psi(x - T(F))}{\lambda'_F(T(F))}$,

(4.43)  $\mu(T, F) = 0$,

and

(4.44)  $\sigma^2(T, F) = \frac{\int \psi^2(x - T(F)) \, dF(x)}{[\lambda'_F(T(F))]^2}$,

in accord with (2.6), (2.10) and (2.31). □

In subsection 2.4, a concept of differential for a sequence of statistics, rather than a functional, was introduced. Exploiting this approach, we now further relax the conditions on $\psi$.

Let $T_o$ be a solution of the equation (4.1), i.e., satisfy $\lambda_F(T_o) = 0$. Analogous to $T(F; \Delta)$ given by (4.31), we introduce

(4.45)  $T_o(F; \Delta) = \frac{\int \psi(x - T_o) \, d\Delta(x)}{-\lambda'_F(T_o)}$.

Analogous to the function $h(t)$ used above, we introduce

(4.46)  $h_o(t) = \frac{\lambda_F(t) - \lambda_F(T_o)}{t - T_o}$,  $t \ne T_o$,

        $h_o(t) = \lambda'_F(T_o)$,  $t = T_o$.

The role of $T(F_n)$ will now be played by a sequence of statistics $\{T_n\}$ which essentially are solutions of the equation (4.2) and converge to $T_o$. (Specific conditions are given in the theorem below.) Corresponding to such a sequence $\{T_n\}$, we define a sequence $\{Z_n\}$ by

(4.47)  $Z_n = \frac{\lambda'_F(T_o)}{h_o(T_n)}$.

In the case that $T_n$ is indeed a solution of (4.2), i.e., satisfies

(4.48)  $\lambda_{F_n}(T_n) = 0$,

we obtain, by a derivation similar to that leading to (4.38),

(4.49)  $|T_n - T_o - Z_n \, T_o(F; F_n - F)| \le \frac{\|F_n - F\|_\infty \, \|\psi(x - T_o) - \psi(x - T_n)\|_V}{|h_o(T_n)|}$.

The following analogue of Theorem 4.3 thus follows readily.

THEOREM 4.4. Let $F$ and $\psi$ be such that $\lambda_F(t) = 0$ has a solution $T_o$. Assume that $\lambda'_F(T_o) \ne 0$. Suppose that $\psi$ is continuous and satisfies (4.40). Let $\{X_i\}$ be a sequence of independent observations on $F$. Let the sequence $\{T_n\}$, $T_n = T_n(X_1, \ldots, X_n)$, satisfy

(4.50)  $P\{\lambda_{F_n}(T_n) = 0, \text{ all } n \text{ sufficiently large}\} = 1$

and

(4.51)  $P\{T_n \to T_o, \; n \to \infty\} = 1$.

Define $\{Z_n\}$ by (4.47). Then $T_o(F; \Delta)$ given by (4.45) is a strong stochastic quasi-differential for $\{T_n\}$ at $(T_o, F)$ with respect to the norm $\|\cdot\|_\infty$, the sequence $\{X_i\}$ and the sequence $\{Z_n\}$.

Let us compare the assumptions of Theorems 4.3 and 4.4. In the latter, conditions (4.50) and (4.51) essentially characterize $\{T_n\}$ as, with probability 1, a consistent sequence of solutions of equation (4.2). Implicitly, the same is assumed in Theorem 4.3 for the sequence $\{T(F_n)\}$. This is seen from Theorem 4.1. Regarding conditions on $\psi$, Theorem 4.4 retains (4.40) but eliminates (4.41). However, note that (4.40) almost contains the assumption that $\psi$ is continuous. For, noting that

(4.52)  $|\psi(y_o - b) - \psi(y_o) - [\psi(x_o - b) - \psi(x_o)]| \le \|\psi(x - b) - \psi(x)\|_V$,

and putting $y_o = x_o - b$ in (4.52), we have by (4.40) that

$\psi(x_o - 2b) - 2\psi(x_o - b) + \psi(x_o) \to 0$ as $b \to 0$.

Thus, if $\psi$ is right- and left-continuous at $x_o$ (as in the case of $\psi$ nondecreasing), then

$\psi(x_o+) = \psi(x_o-) = \psi(x_o)$.

Moreover, the requirements (4.50) and (4.51) implicitly restrict $\psi$. Thus Theorem 4.4 represents but a mild relaxation of the assumptions on $\psi$ imposed by Theorem 4.3. On the other hand, Theorem 4.4 provides considerable latitude in the manner of definition of a "solution" of the equation (4.2). Nevertheless Theorem 4.3, while specific to the sequence $\{T(F_n)\}$, does not require a separate treatment of the consistency issue.

4.6. Asymptotic normality and almost sure behavior of $T(F_n)$. We have established in Theorem 4.3 that, under appropriate assumptions on $\psi$ and $F$, the M-functional T(·) possesses a strong stochastic quasi-differential. We now apply this in connection with the general results, Theorems 2.1 and 2.2.

Consider the functional $T_F(\cdot)$ defined by (4.36). Under the conditions of Theorem 4.3, it follows by Theorem 4.1 that $T(F_n) \xrightarrow{wp1} T(F)$ and hence, since $h(t)$ is continuous at $t = T(F)$, that

(4.53)  $T_F(F_n) \xrightarrow{wp1} 1$, $n \to \infty$.

Therefore, by Theorems 2.1 and 2.2 in conjunction with (4.43), we have

THEOREM 4.5. Let $F$, $p_1$, $p_2$ and $\psi$ be such that Condition A holds. Assume that $\lambda'_F(T(F)) \ne 0$. Suppose that either

(4.54a)  $\psi$ is continuous and nondecreasing

or

(4.54b)  $\psi$ is uniformly continuous.

Suppose also that (4.40) holds. Assume that $\sigma^2(T, F)$ given by (4.44) is finite and positive. Let $\{X_i\}$ be independent observations on $F$. Then

(4.55)  $\sqrt{n} \, [T(F_n) - T(F)] \xrightarrow{d} N(0, \sigma^2(T, F))$

and

(4.56)  $\limsup_{n \to \infty} \frac{\sqrt{n} \, [T(F_n) - T(F)]}{\sqrt{2 \sigma^2(T, F) \log \log n}} = 1$ w. p. 1.
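To make the variance formula (4.44) concrete, here is a small sketch that evaluates $\sigma^2(T, F)$ when $F$ is the standard normal distribution and $\psi$ is a Huber-type function clipped at $\pm k$. Both choices, the closed-form expression, and the function names `huber_sigma2_closed` and `huber_sigma2_numeric` are assumptions introduced for illustration; the text does not single out this case. For this $\psi$, $\int \psi' \, dF = P(|X| < k)$, and the square in the denominator removes the sign of $\lambda'_F$.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal F (an assumption of this example)


def huber_sigma2_closed(k):
    """sigma^2(T, F) = int psi^2 dF / (int psi' dF)^2 for Huber psi, F = N(0, 1)."""
    p = 2.0 * STD_NORMAL.cdf(k) - 1.0                # int psi' dF = P(|X| < k)
    num = p - 2.0 * k * STD_NORMAL.pdf(k) + 2.0 * k * k * (1.0 - STD_NORMAL.cdf(k))
    return num / (p * p)


def huber_sigma2_numeric(k, half_width=8.0, steps=16000):
    """Same quantity by a midpoint rule on [-half_width, half_width]."""
    h = 2.0 * half_width / steps
    num = 0.0
    den = 0.0
    for i in range(steps):
        x = -half_width + (i + 0.5) * h
        w = STD_NORMAL.pdf(x) * h
        psi = max(-k, min(k, x))
        num += psi * psi * w                         # contributes to int psi^2 dF
        den += (1.0 if abs(x) < k else 0.0) * w      # contributes to int psi' dF
    return num / (den * den)
```

With k = 1.5 both functions give a value of roughly 1.037, only slightly above the value 1 obtained for $\psi(x) = x$ (the sample mean) at the normal model.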


Under the broader assumptions of Theorem 4.4, we have an analogous result, based on

(4.57)  $\sigma_o^2(T_o, F) = \frac{\int \psi^2(x - T_o) \, dF(x)}{[\lambda'_F(T_o)]^2}$.

Specifically, we state

THEOREM 4.6. Let $F$ and $\psi$ be such that $\lambda_F(t) = 0$ has a solution $T_o$. Assume that $\sigma_o^2(T_o, F)$ given by (4.57) is finite and positive. Let $\{X_i\}$ be independent observations on $F$. Let the sequence $\{T_n\}$, $T_n = T_n(X_1, \ldots, X_n)$, satisfy

(4.58a)  $P\{\lambda_{F_n}(T_n) = 0, \text{ all } n \text{ sufficiently large}\} = 1$

and

(4.58b)  $P\{T_n \to T_o, \; n \to \infty\} = 1$.

Then

(4.59)  $\sqrt{n} \, [T_n - T_o] \xrightarrow{d} N(0, \sigma_o^2(T_o, F))$

and

(4.60)  $\limsup_{n \to \infty} \frac{\sqrt{n} \, [T_n - T_o]}{\sqrt{2 \sigma_o^2(T_o, F) \log \log n}} = 1$ w. p. 1.

4.7. Discussion of specific examples. (i) Regarding restrictions on $\psi$, note that the classical estimator, $\psi(x) = x$, satisfies (4.54a) and (4.40). So does the "Huber" (4.4). The "Hampel" (4.5) and the "sine" (4.6) do not satisfy (4.54a), but they do satisfy (4.54b); they also satisfy (4.40). (In checking the validity of (4.40), a helpful tool is the relation $\|g\|_V = \int |g'(x)| \, dx$, valid for absolutely continuous $g$.) For $\psi$ of the form $\psi(x) = x^\delta$, condition (4.40) holds if and only if $\delta \le 1$.

(ii) Regarding restrictions on $F$, the major issue is Condition A and its variations. If $\psi$ is continuous and strictly increasing, then (4.1) has at most one solution and virtually any choices of $p_1$ and $p_2$ close to 0 and 1, respectively, will allow Condition A to be satisfied. For "redescending" $\psi$ such as (4.5) and (4.6), there are several situations of interest. If the support of $F$ is the real line and $F$ is sufficiently regular (avoiding pathologies and bimodality), then again the solution of (4.1) will be unique. If $F$ has bounded support, multiple solutions may arise, but in typical cases only one solution lies in the range of the support of $F$. For example, if $\psi$ is given by (4.5) with $a = 1.5$, $b = 3$, $c = 6$, and if $F$ is uniform on (0, 1), then the set of solutions to (4.1) is $\{1/2\} \cup \{x: x \le -6\} \cup \{x: x \ge 7\}$. In this case Condition A is satisfied for any choices $0 < p_1 < 1/2 < p_2 < 1$. Continuity of $F$ plays a fairly unimportant role here. Suppose that $F$ is merely in the neighborhood of an "ideal" distribution $F_o$ such as the standard normal or a uniform. Then for typical $\psi$ the curve $\lambda_F(c)$ follows closely the curve $\lambda_{F_o}(c)$. This follows by Lemma 3.1, which gives

(4.61)  $|\lambda_F(c) - \lambda_{F_o}(c)| \le \|F - F_o\|_\infty \cdot \|\psi\|_V$.

(Note that $\|\psi\|_V < \infty$ holds for $\psi$'s of the forms (4.4) - (4.6).) Thus, if Condition A is satisfied for $F_o$, $p_1$, $p_2$, then for $F$ in a large portion of a sufficiently small neighborhood of $F_o$, it holds also for the same $p_1$, $p_2$.
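The uniform-distribution example in (ii) can be checked numerically. The sketch below assumes that (4.5) is the usual three-part redescending "Hampel" function with breakpoints $a$, $b$, $c$ (the formula itself is not reproduced in this subsection, so this form is an assumption), and evaluates $\lambda_F(t) = \int_0^1 \psi(x - t) \, dx$ by a midpoint rule. The names `hampel_psi` and `lambda_uniform` are hypothetical.

```python
import math


def hampel_psi(x, a=1.5, b=3.0, c=6.0):
    """Three-part redescending psi (assumed form of (4.5)); odd in x."""
    ax = abs(x)
    if ax <= a:
        val = ax                        # linear part
    elif ax <= b:
        val = a                         # constant part
    elif ax <= c:
        val = a * (c - ax) / (c - b)    # linear descent to zero
    else:
        val = 0.0                       # psi vanishes outside [-c, c]
    return math.copysign(val, x)


def lambda_uniform(t, steps=2000):
    """lambda_F(t) = integral over (0, 1) of psi(x - t) dx, F uniform on (0, 1)."""
    h = 1.0 / steps
    return sum(hampel_psi((i + 0.5) * h - t) for i in range(steps)) * h
```

Evaluating `lambda_uniform` at 0.5, at points below -6, and at points above 7 returns zero (up to rounding), while points strictly between these regions give nonzero values, in agreement with the solution set stated above.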

4.8. One-step M-estimators for location. Discovering solutions to (4.1) or (4.2) is theoretically easy, but the actual computations may be excessively costly. An alternative is to take the first Gauss-Newton iteration of (4.2) as an estimator of $T(F)$ rather than the exact solution $T(F_n)$. Such estimators were included in the Princeton Study (Andrews et al. (1972)) and were found to behave very much like their full-iterate counterparts, even in the case of small samples. Bickel (1975) has introduced the use of such estimators in the linear model.

Consider solving $\lambda_F(c) = 0$ using Newton's method with a starting point $\tilde{T}$. The first iteration has the form

(4.62)  $T^{(1)} = \tilde{T} - \frac{\lambda_F(\tilde{T})}{\lambda'_F(\tilde{T})}$.

Now suppose that $\lambda_F(T_o) = 0$ and that $\tilde{T}_n$ is an "initial" estimator which converges to $T_o$ in some stochastic sense. We define the one-step M-estimator $T_n$ by

(4.63)  $T_n = \tilde{T}_n - \frac{\lambda_{F_n}(\tilde{T}_n)}{\lambda'_{F_n}(\tilde{T}_n)}$,

where

(4.64)  $\lambda'_{F_n}(t) = -\frac{1}{n} \sum_{i=1}^{n} \psi'(X_i - t)$.

Here we are assuming that $\psi'$ exists everywhere. We could relax this requirement by replacing $\lambda'_{F_n}(\tilde{T}_n)$ in (4.63) by a suitable estimator of $\lambda'_F(T_o)$.
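The one-step construction (4.63)-(4.64) can be sketched in a few lines. In this illustration the smooth function $\psi(x) = \tanh(x)$ and the sample median as starting value are assumptions chosen for convenience (any $\psi$ with $\psi'$ existing everywhere would do), and the function names are hypothetical.

```python
import math
import statistics


def psi(x):
    """A smooth, bounded, odd psi (illustrative choice, not from the text)."""
    return math.tanh(x)


def dpsi(x):
    """Derivative psi'(x) = 1 - tanh(x)^2, which exists everywhere."""
    return 1.0 - math.tanh(x) ** 2


def lambda_n(t, xs):
    """lambda_{F_n}(t) = (1/n) * sum_i psi(X_i - t)."""
    return sum(psi(x - t) for x in xs) / len(xs)


def dlambda_n(t, xs):
    """lambda'_{F_n}(t) = -(1/n) * sum_i psi'(X_i - t), as in (4.64)."""
    return -sum(dpsi(x - t) for x in xs) / len(xs)


def one_step(xs):
    """One Newton step (4.63) on lambda_{F_n}, started at the sample median."""
    t0 = statistics.median(xs)
    return t0 - lambda_n(t0, xs) / dlambda_n(t0, xs)
```

For a sample such as `[0.1, 0.4, 0.5, 1.2, 3.0]` the single step moves the estimate from the median 0.5 to about 0.817, and $|\lambda_{F_n}|$ at the new point is much smaller than at the median.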


For symmetric distributions $F$ and skew-symmetric $\psi$, $\psi(x) = -\psi(-x)$, a natural starting estimator would be the sample median $F_n^{-1}(1/2)$. In asymmetric situations, there is a problem of finding a starting estimator which actually converges to $T_o$ and satisfies $\lambda_F(T_o) = 0$. Of course, if $\tilde{T}_n \to T_1$ and $T_1$ is close to $T_o$, then (4.63) would still give a reasonable (though not consistent) estimator of $T_o$. The following theorem is similar to Theorem 4.4. Let $Z_n = \lambda'_F(T_o) / \lambda'_{F_n}(\tilde{T}_n)$.

THEOREM 4.7. Let $F$, $\psi$, and $T_o$ be such that $\lambda_F(T_o) = 0$, $\lambda'_F(T_o) \ne 0$, and (4.40) holds. Suppose that $\psi$ and $\psi'$ are continuous and that either

(i) $\psi'$ satisfies

(4.65a)  $\|\psi'(x - c) - \psi'(x)\|_V = O(1)$, $c \to 0$

and

(4.65b)  $\lim_{b \to T_o} \lambda'_F(b) = \lambda'_F(T_o)$,

or

(ii) $\psi'$ is uniformly continuous.

Let $\{X_i\}$ be a sequence of independent observations on $F$. Define $T_n$ by (4.63) and let $\tilde{T}_n$ be an estimator such that

(4.66)  $\sqrt{n} \, (\tilde{T}_n - T_o) = O_p(1)$, $n \to \infty$.

Then $T_o(F; \Delta)$ defined by (4.45) is a weak stochastic quasi-differential for $\{T_n\}$ at $(T_o, F)$ w. r. t. $\|\cdot\|_\infty$, $\{X_i\}$, and $\{Z_n\}$. If, further,

(4.66')  $\tilde{T}_n - T_o = O(\|F_n - F\|_\infty)$, $n \to \infty$, w. p. 1,

then $T_o(F; \Delta)$ defined by (4.45) is a strong stochastic quasi-differential for $\{T_n\}$ at $(T_o, F)$ w. r. t. $\|\cdot\|_\infty$, $\{X_i\}$ and $\{Z_n\}$.



In  the  proof  we  shall  need  the  following  simple  extension  of  the  law  of 
large  numbers. 

LEMMA 4.6. Let $\{X_i\}$ be a sequence of independent observations on $F$ and let $T_n = T_n(X_1, \ldots, X_n)$ satisfy

(4.67)  $T_n \xrightarrow{wp1} T_o$, $n \to \infty$.

If either

(i) $g$ is continuous and satisfies

(4.68a)  $\|g(x - c) - g(x)\|_V = O(1)$, $c \to 0$,

and

(4.68b)  $\lim_{b \to T_o} \int g(x - b) \, dF(x) = \int g(x - T_o) \, dF(x)$,

or

(ii) $g$ is uniformly continuous,

then

(4.69)  $\int g(x - T_n) \, dF_n(x) \xrightarrow{wp1} \int g(x - T_o) \, dF(x)$, $n \to \infty$.

PROOF. (i) Write

$|\int g(x - T_n) \, dF_n(x) - \int g(x - T_o) \, dF(x)| \le |\int [g(x - T_n) - g(x - T_o)] \, d[F_n(x) - F(x)]|$
    $+ |\int [g(x - T_n) - g(x - T_o)] \, dF(x)|$
    $+ |\int g(x - T_o) \, d[F_n(x) - F(x)]|$.

By Lemma 3.1, the first term is bounded by $\|F_n - F\|_\infty \, \|g(x - T_n) - g(x - T_o)\|_V$, which $\xrightarrow{wp1} 0$ by (4.68a) and the Glivenko-Cantelli Theorem. The second term $\xrightarrow{wp1} 0$ by (4.68b), and the third term $\xrightarrow{wp1} 0$ by the classical SLLN.

(ii) Write

$|\int g(x - T_n) \, dF_n(x) - \int g(x - T_o) \, dF(x)| \le |\int [g(x - T_n) - g(x - T_o)] \, dF_n(x)|$
    $+ |\int g(x - T_o) \, d[F_n(x) - F(x)]|$.

The first term is bounded by $\|g(x - T_n) - g(x - T_o)\|_\infty$, which $\xrightarrow{wp1} 0$ by the uniform continuity of $g$. The second term $\xrightarrow{wp1} 0$ by the SLLN. □

REMARK. Condition (4.68a) is a relaxation of (4.40) (for $g = \psi$) from $o(1)$ to $O(1)$. If (4.67) is replaced by $T_n \xrightarrow{p} T_o$, then the analogous weak law follows. □

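PROOF OF THEOREM 4.7. We will prove the "strong" version, the "weak" version following analogously. By appropriate substitutions for $T_n$, $Z_n$ and $T_o(F; F_n - F)$, we have

(4.70)  $|T_n - T_o - Z_n \, T_o(F; F_n - F)| = \left| \tilde{T}_n - T_o - \frac{\lambda_{F_n}(\tilde{T}_n) - \lambda_{F_n}(T_o) + \lambda_F(T_o)}{\lambda'_{F_n}(\tilde{T}_n)} \right|$

    $\le \left| \tilde{T}_n - T_o - \frac{\lambda_F(\tilde{T}_n)}{\lambda'_{F_n}(\tilde{T}_n)} \right| + \frac{|\int [\psi(x - \tilde{T}_n) - \psi(x - T_o)] \, d[F_n(x) - F(x)]|}{|\lambda'_{F_n}(\tilde{T}_n)|}$.

The use of either (i) or (ii), in conjunction with (4.66') and Lemma 4.6 with $g = \psi'$, yields

$\lambda'_{F_n}(\tilde{T}_n) \xrightarrow{wp1} \lambda'_F(T_o)$, $n \to \infty$.

Since $\lambda'_F(T_o) \ne 0$, the second term of (4.70) is $o(\|F_n - F\|_\infty)$, by (4.40). Now take $h_o$ as in (4.46), so that

(4.71)  $h_o(\tilde{T}_n)(\tilde{T}_n - T_o) = \lambda_F(\tilde{T}_n) - \lambda_F(T_o) = \lambda_F(\tilde{T}_n)$.

Substitution for $\lambda_F(\tilde{T}_n)$ in the first term of (4.70) gives

$\left| \tilde{T}_n - T_o - \frac{h_o(\tilde{T}_n)(\tilde{T}_n - T_o)}{\lambda'_{F_n}(\tilde{T}_n)} \right| = |\tilde{T}_n - T_o| \cdot \left| 1 - \frac{h_o(\tilde{T}_n)}{\lambda'_{F_n}(\tilde{T}_n)} \right|$.

This last expression is easily seen to be $o(\|F_n - F\|_\infty)$ w. p. 1 since $\tilde{T}_n - T_o = O(\|F_n - F\|_\infty)$ and $h_o(\tilde{T}_n) / \lambda'_{F_n}(\tilde{T}_n) \xrightarrow{wp1} 1$, $n \to \infty$. □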

The following theorem uses the results of Theorem 4.7 to provide asymptotic normality and the law of the iterated logarithm for the one-step M-estimator $T_n$ defined by (4.63).

THEOREM 4.8. Let $F$, $\psi$, and $T_o$ be such that $\lambda_F(T_o) = 0$, $\lambda'_F(T_o) \ne 0$, and (4.40) holds. Suppose that $\psi$ and $\psi'$ are continuous and also that $\psi'$ satisfies (4.65) or is uniformly continuous. Suppose that $\sigma_o^2(T_o, F)$ given by (4.57) is finite and positive. Let $\{X_i\}$ be a sequence of independent observations on $F$. Define $T_n$ by (4.63) and let $\tilde{T}_n$ satisfy (4.66). Then

(4.72)  $\sqrt{n} \, (T_n - T_o) \xrightarrow{d} N(0, \sigma_o^2(T_o, F))$, $n \to \infty$.

If, further, (4.66') holds, then

(4.73)  $\limsup_{n \to \infty} \frac{\sqrt{n} \, (T_n - T_o)}{\sqrt{2 \sigma_o^2(T_o, F) \log \log n}} = 1$ w. p. 1.


EXAMPLE. Consider $\psi_o$ found by Collins (1976), Theorem 3.1, as the solution of a minimax problem:

$\psi_o(x) = -\psi_o(-x) = x$,  $0 \le x \le x_o$,
                        $= x_1 \tanh[\tfrac{1}{2} x_1 (c - x)]$,  $x_o \le x \le c$,
                        $= 0$,  $x > c$,

where $x_o$ and $x_1$ are related by $x_o = x_1 \tanh[\tfrac{1}{2} x_1 (c - x_o)]$. Suppose that $F$ is symmetric about $\theta$ (playing the role of $T_o$). Then $\lambda_F(\theta) = 0$, $\lambda'_F(\theta) \ne 0$, (4.40) holds, and $\psi'$ is uniformly continuous. Let $\tilde{T}_n = F_n^{-1}(1/2)$. Then, as is well-known, (4.66) holds. If, further, $F'(\theta) > 0$, then (4.66') also follows. Smoothed versions of (4.4) - (4.6) provide other possibilities for $\psi$. □

4.9. M-estimators, scale unknown. In previous subsections the motivating statistical setting for M-estimation was the simple location problem. Thus the functionals defined were invariant w. r. t. changes in location (i.e., $T(F(x - \mu)) = T(F) + \mu$), but not necessarily w. r. t. changes in scale. For practical use we almost always need location functionals which are also scale invariant (i.e., $T(F((x - \mu)/a)) = aT(F) + \mu$, $a > 0$). The obvious way to achieve the desired invariance for M-functionals is to define $T(F)$ as a solution of

(4.74)  $\int \psi\left( \frac{x - T(F)}{S(F)} \right) dF(x) = 0$,

where $S(F)$ is a scale functional satisfying $S(F((x - \mu)/a)) = aS(F)$ for $a > 0$ and all $\mu$. Then if $T(F)$ is a solution of (4.74), we have

(4.75)  $\int \psi\left( \frac{x - [aT(F) + \mu]}{aS(F)} \right) dF\left( \frac{x - \mu}{a} \right) = 0$,

showing that $T(F((x - \mu)/a)) = aT(F) + \mu$. One may choose $S(F)$ independently or obtain $S(F)$ by simultaneously solving

$\int \psi\left( \frac{x - T(F)}{S(F)} \right) dF(x) = 0$

and

$\int \chi\left( \frac{x - T(F)}{S(F)} \right) dF(x) = 0$,

where, e.g., $\chi(t) = t\psi(t) - 1$ (see Bickel (1975) or Huber (1964)). We prefer to choose $S(F)$ independently, e.g., the interquartile range

$S(F) = F^{-1}(3/4) - F^{-1}(1/4)$,

or any member of a large class of linear functions of order statistics for scale estimation. In either case we would expect $S(F_n) \to S(F)$ to hold in some stochastic sense. The natural estimator $T(F_n)$ would then be a solution of

(4.76)  $\int \psi\left( \frac{x - T(F_n)}{S(F_n)} \right) dF_n(x) = 0$.

First we make precise the definition of M-functional T(·) for the scale unknown case. Then we give analogues of Lemmas 4.2, 4.3 and 4.4 and Theorems 4.1 and 4.4.
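As a hedged illustration of the scale-unknown estimator just described, the sketch below solves the empirical equation $\sum_i \psi((X_i - c)/S_n) = 0$ with $S_n$ taken to be a crude sample interquartile range, and can be used to check the equivariance $T(aX + \mu) = aT(X) + \mu$ numerically. The Huber-type $\psi$, the simple order-statistic quantile rule, the bisection search over the sample range, and all function names are assumptions made for this example.

```python
def huber_psi(x, k=1.5):
    """Huber-type psi (assumed form): identity on [-k, k], clipped outside."""
    return max(-k, min(k, x))


def quantile(sorted_xs, p):
    """Crude empirical quantile: the order statistic at index floor(p * n)."""
    n = len(sorted_xs)
    return sorted_xs[min(n - 1, int(p * n))]


def iqr(xs):
    """Sample interquartile range, playing the role of S(F_n)."""
    s = sorted(xs)
    return quantile(s, 0.75) - quantile(s, 0.25)


def m_estimate_scaled(xs, psi=huber_psi, tol=1e-12):
    """Smallest root in [min, max] of sum_i psi((X_i - c) / S_n) = 0, psi nondecreasing."""
    s_n = iqr(xs)
    if s_n <= 0:
        raise ValueError("interquartile range must be positive")
    lo, hi = min(xs), max(xs)
    while hi - lo > tol * max(1.0, abs(lo), abs(hi)):
        mid = 0.5 * (lo + hi)
        if sum(psi((x - mid) / s_n) for x in xs) > 0:
            lo = mid  # the root lies to the right of mid
        else:
            hi = mid  # the smallest root is at or left of mid
    return 0.5 * (lo + hi)
```

Applying `m_estimate_scaled` to a sample and to an affinely transformed copy `[a * x + mu for x in xs]` with `a > 0` should return values related by the same affine map, up to the bisection tolerance.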

As previously, we allow for possible nonuniqueness of solution in (4.74) and (4.76), by selecting the smallest solution lying in $I_F(0) = [F^{-1}(p_1), F^{-1}(p_2)]$. We introduce the following definitions:

(4.77)  $\lambda_F(c, s) = \int \psi\left( \frac{x - c}{s} \right) dF(x)$.

(4.78)  $C_s(\psi; F; p_1; p_2) = \{c: \lambda_F(c, S(F)) = 0 \text{ and } F^{-1}(p_1) \le c \le F^{-1}(p_2)\}$.

(4.79)  $T(F) = \inf C_s(\psi; F; p_1; p_2)$, if $C_s(\psi; F; p_1; p_2)$ is nonempty;
        $T(F) = F^{-1}(\tfrac{1}{2}(p_1 + p_2))$, otherwise.

(4.80)  $\mathcal{G}_F^s(\epsilon) = \{G: \lambda_G(T(G), S(G)) = 0 \text{ and } T(G) \in I_F(\epsilon)\}$.

CONDITION $A_s$. (i) The equation $\lambda_F(c, S(F)) = 0$ has a unique solution $c = T(F)$ in the interval $[F^{-1}(p_1), F^{-1}(p_2)]$, and $\lambda_F(c, S(F))$ changes sign at $T(F)$;

(ii) In fact, $T(F)$ lies in the slightly smaller interval $(F^{-1}(p_1 + \epsilon_1), F^{-1}(p_2))$ for some $\epsilon_1 > 0$;

(iii) Moreover, $T(F)$ is the unique zero of $\lambda_F(\cdot, S(F))$ in the slightly larger interval $[F^{-1}(p_1) - \epsilon_2, F^{-1}(p_2) + \epsilon_2]$ for some $\epsilon_2 > 0$.

The following analogue of Lemma 4.2 follows by similar arguments.

LEMMA 4.7. Let $S(\cdot)$, $F$, $p_1$, $p_2$, and $\psi$ be such that Condition $A_s$ holds. Let $\{G_n\}$ satisfy (4.11) and

(4.81)  $S(G_n) \to S(F)$, $n \to \infty$;

(4.82)  $\lambda_{G_n}(\cdot, \cdot)$ converges continuously to $\lambda_F(\cdot, \cdot)$; i.e., if $(c_n, d_n) \to (c, d)$, then $\lambda_{G_n}(c_n, d_n) \to \lambda_F(c, d)$, $n \to \infty$;

(4.83)  $\lambda_{G_n}(c, s)$ is continuous in $c$, each $s$ and each $n$.

Then

(4.84)  $\lim_{n \to \infty} T(G_n) = T(F)$.

A stronger condition than (4.83) is

(4.85)  $\lambda_{G_n}(c, s)$, each $n$, and $\lambda_F(c, s)$ are each jointly continuous in $(c, s)$.

The next result, analogous in part to Lemma 4.3, gives sufficient conditions for (4.85) and (4.82) to hold.

LEMMA 4.8. (i) If $\psi$ is continuous and bounded, then (4.85) holds. If also (4.11) holds, then

(4.86)  $\lambda_{G_n}(c, s) \to \lambda_F(c, s)$ as $n \to \infty$, all $(c, s)$.

(ii) If $\psi$ is continuous and nondecreasing, then (4.85) holds. If also (4.11), (4.81), and (4.86) hold, then (4.82) holds.

(iii) If $\psi$ is continuous and of bounded variation, then (4.85) holds. If also $\|G_n - F\|_\infty \to 0$ as $n \to \infty$, then (4.86) and (4.82) hold.

REMARK. Since (iii) of Lemma 4.3 assumes $\psi$ to be uniformly continuous, one might expect (iii) of the above analogous lemma to make the same assumption. However, in proving (iii) of Lemma 4.3, we take advantage of the fact that $|(x - c_n) - (x - c)| = |c_n - c|$ is independent of $x$, whereas for the above situation,

$\left| \frac{x - c_n}{d_n} - \frac{x - c}{d} \right|$

depends on $x$. Thus the stronger assumption of bounded variation is made. □

PROOF. (i) Apply the Dominated Convergence Theorem and the Helly-Bray Theorem.

(ii) The first statement follows from the Monotone Convergence Theorem. Now suppose (4.11), (4.81) and (4.86) hold. Let $\delta_1 > 0$ and $\delta_2 > 0$ be given. Choose $n_o$ large enough that $|c_n - c| < \delta_1$ and $|S(G_n) - S(F)| < \delta_2$ for all $n \ge n_o$. Then, since $\psi$ is nondecreasing, we have

$|\lambda_{G_n}(c_n, S(G_n)) - \lambda_{G_n}(c, S(F))| \le |\lambda_{G_n}(c + \delta_1, S(F) + \delta_2) - \lambda_{G_n}(c, S(F))|$
    $+ |\lambda_{G_n}(c - \delta_1, S(F) - \delta_2) - \lambda_{G_n}(c, S(F))|$.

Continuing as in the proof of Lemma 4.3, we have

(4.87)  $\lim_{n \to \infty} |\lambda_{G_n}(c_n, S(G_n)) - \lambda_{G_n}(c, S(F))| = 0$.

Then (4.82) follows from (4.86) and (4.87) since

$|\lambda_{G_n}(c_n, S(G_n)) - \lambda_F(c, S(F))| \le |\lambda_{G_n}(c_n, S(G_n)) - \lambda_{G_n}(c, S(F))|$
    $+ |\lambda_{G_n}(c, S(F)) - \lambda_F(c, S(F))|$.

(iii) Since $\psi$ is bounded, (4.85) follows from (i). Since the condition $\|G_n - F\|_\infty \to 0$ implies $G_n \Rightarrow F$, (4.86) also follows from (i). Then

$|\lambda_{G_n}(c_n, S(G_n)) - \lambda_F(c, S(F))| \le |\lambda_{G_n}(c_n, S(G_n)) - \lambda_F(c_n, S(G_n))|$
    $+ |\lambda_F(c_n, S(G_n)) - \lambda_F(c, S(F))|$.

Lemma 3.1 allows us to bound the first term by $\|G_n - F\|_\infty \cdot \|\psi\|_V$. The second term converges to 0 as $n \to \infty$ by (4.85). □

LEMMA 4.9. Let $S(\cdot)$, $F$, $p_1$, $p_2$ and $\psi$ be such that Condition $A_s$ holds. Suppose that $\{X_i\}$ is a sequence of observations on $F$ such that

(4.88)  $S(F_n) \xrightarrow{wp1} S(F)$, $n \to \infty$,

(4.89)  $\|F_n - F\|_\infty \xrightarrow{wp1} 0$, $n \to \infty$,

(4.90)  $\lambda_{F_n}(c, s) \xrightarrow{wp1} \lambda_F(c, s)$, $n \to \infty$ (each $(c, s)$).

Suppose that either

(4.91)  $\psi$ is continuous and nondecreasing

or

(4.92)  $\psi$ is continuous and of bounded variation.

Then

(4.93)  $P\{F_n \in \mathcal{G}_F^s(\epsilon_2), \text{ all } n \text{ sufficiently large}\} = 1$

and

(4.94)  $T(F_n) \xrightarrow{wp1} T(F)$, $n \to \infty$.

PROOF. Writing

(4.95)  $\lambda_{F_n}(c, s) = \frac{1}{n} \sum_{i=1}^{n} \psi\left( \frac{X_i - c}{s} \right)$,

we see that $\psi$ continuous implies (4.83), with $G_n$ replaced by $F_n$. By (4.88), (4.89) and (4.90), we have that (4.11), (4.81), and (4.86), with $G_n$ replaced by $F_n$, hold w. p. 1. Then by (4.91) or (4.92) (via Lemma 4.8), we have that (4.82), with $G_n$ replaced by $F_n$, holds w. p. 1. Thus by Lemma 4.7, (4.93) and (4.94) hold. □

THEOREM 4.9. Let $S(\cdot)$, $F$, $p_1$, $p_2$, and $\psi$ be such that Condition $A_s$ holds. Suppose that either (4.91) or (4.92) holds. Let $\{X_i\}$ be a sequence of independent observations on $F$ such that

$S(F_n) \xrightarrow{wp1} S(F)$, $n \to \infty$.

Then (4.93) and (4.94) hold.

PROOF. Conditions (4.89) and (4.90) hold by the Glivenko-Cantelli Theorem and the classical Strong Law of Large Numbers. □

We now establish a differential in the scale unknown case. Omitting analogues to Theorems 4.2 and 4.3, the most general result is provided, an analogue to Theorem 4.4. Some further definitions are necessary. Denote the partial derivatives of $\lambda_F(c, s)$ at $(T_o, S_o)$ by $D_1 \lambda_F(T_o, S_o)$ and $D_2 \lambda_F(T_o, S_o)$. Define $h_o^*$ by

$h_o^*(t) = \frac{\lambda_F(t, S_o) - \lambda_F(T_o, S_o)}{t - T_o}$,  $t \ne T_o$,

$h_o^*(t) = D_1 \lambda_F(T_o, S_o)$,  $t = T_o$.

In the following theorem, $T_o$ is arbitrary; one possibility is $T_o = T(F)$, with T(·) defined by (4.79). Let $Z_n = D_1 \lambda_F(T_o, S_o) / h_o^*(T_n)$.

THEOREM 4.10. Let $F$ and $\psi$ be such that $\lambda_F(t, S_o) = 0$ has a solution $T_o$. Assume that $D_1 \lambda_F(T_o, S_o) \ne 0$ and that $D_2 \lambda_F(t, s)$ is continuous in a neighborhood of $(T_o, S_o)$. Suppose that $\psi$ is continuous, that (4.40) holds, and that

(4.96)  $\lim_{c \to 1} \|\psi(cx) - \psi(x)\|_V = 0$.

Let $\{X_i\}$ be a sequence of independent observations on $F$. Let $T_n$ and $S_n$ be estimators such that

(4.97)  $\{S_n\}$ has a strong stochastic differential $S(F; \Delta)$ at $(S_o, F)$ w. r. t. $\|\cdot\|_\infty$ and the sequence $\{X_i\}$;

(4.98)  $S_n - S_o = O(\|F_n - F\|_\infty)$, $n \to \infty$, w. p. 1;

(4.99)  $T_n \xrightarrow{wp1} T_o$, $n \to \infty$;

(4.100)  $P\{\lambda_{F_n}(T_n, S_n) = 0, \text{ all } n \text{ sufficiently large}\} = 1$.

Then $\{T_n\}$ has a strong stochastic quasi-differential $T(F; \Delta)$ at $(T_o, F)$ w. r. t. $\|\cdot\|_\infty$, the sequence $\{X_i\}$, and the sequence $\{Z_n\}$, given by

(4.101)  $T(F; \Delta) = -\frac{\int \psi\left( \frac{x - T_o}{S_o} \right) d\Delta(x) + S(F; \Delta) \, D_2 \lambda_F(T_o, S_o)}{D_1 \lambda_F(T_o, S_o)}$.

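PROOF. For $n$ sufficiently large, $\lambda_F(T_o, S_o) = \lambda_{F_n}(T_n, S_n) = 0$ w. p. 1, and thus w. p. 1,

$h_o^*(T_n)(T_n - T_o) = \lambda_F(T_n, S_o) - \lambda_F(T_o, S_o)$
    $= \lambda_F(T_n, S_o) - \lambda_{F_n}(T_n, S_n)$
    $= \lambda_F(T_n, S_n) - \lambda_{F_n}(T_n, S_n) + \lambda_F(T_n, S_o) - \lambda_F(T_n, S_n)$
    $= \int \psi\left( \frac{x - T_n}{S_n} \right) d(F(x) - F_n(x)) + \lambda_F(T_n, S_o) - \lambda_F(T_n, S_n)$.

Also,

$h_o^*(T_n) \, Z_n \, T(F; F_n - F) = -\left( \int \psi\left( \frac{x - T_o}{S_o} \right) d(F_n(x) - F(x)) + S(F; F_n - F) \, D_2 \lambda_F(T_o, S_o) \right)$.

Then, for $n$ sufficiently large, w. p. 1,

(4.102)  $|h_o^*(T_n)| \, |T_n - T_o - Z_n \, T(F; F_n - F)|$
    $\le \left| \int \left[ \psi\left( \frac{x - T_n}{S_o} \right) - \psi\left( \frac{x - T_o}{S_o} \right) \right] d(F_n(x) - F(x)) \right|$
    $+ \left| \int \left[ \psi\left( \frac{x - T_n}{S_o} \right) - \psi\left( \frac{x - T_n}{S_n} \right) \right] d(F_n(x) - F(x)) \right|$
    $+ |\lambda_F(T_n, S_n) - \lambda_F(T_n, S_o) - S(F; F_n - F) \, D_2 \lambda_F(T_o, S_o)|$.

The first two terms of (4.102) are $o(\|F_n - F\|_\infty)$ as $n \to \infty$, w. p. 1, by (4.40) and (4.96). By the mean value theorem, for $n$ sufficiently large,

$\lambda_F(T_n, S_n) - \lambda_F(T_n, S_o) = D_2 \lambda_F(T_n^*, S_n^*)(S_n - S_o)$ w. p. 1,

where $T_n^*$ and $S_n^*$ are such that $T_n^* \xrightarrow{wp1} T_o$ and $S_n^* \xrightarrow{wp1} S_o$. Thus, the third term of (4.102) can be bounded by

$|D_2 \lambda_F(T_n^*, S_n^*) - D_2 \lambda_F(T_o, S_o)| \cdot |S_n - S_o| + |D_2 \lambda_F(T_o, S_o)| \cdot |S_n - S_o - S(F; F_n - F)|$.

Conditions (4.97) and (4.98) then yield the desired $o(\|F_n - F\|_\infty)$ w. p. 1 result. Division by $h_o^*(T_n)$ poses no problem since $h_o^*(T_n) \xrightarrow{wp1} D_1 \lambda_F(T_o, S_o) \ne 0$. □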

COROLLARY. Let the hypotheses of Theorem 4.10 hold. Suppose that $\sigma_o^2$ is finite and positive, where $\sigma_o^2$ is given by

$\sigma_o^2 = \mathrm{Var}_F\left\{ \psi\left( \frac{X_1 - T_o}{S_o} \right) + S(F; \delta_{X_1} - F) \, D_2 \lambda_F(T_o, S_o) \right\} (D_1 \lambda_F(T_o, S_o))^{-2}$.

Then

$\sqrt{n} \, (T_n - T_o) \xrightarrow{d} N(0, \sigma_o^2)$ as $n \to \infty$

and

$\limsup_{n \to \infty} \frac{\sqrt{n} \, (T_n - T_o)}{\sqrt{2 \sigma_o^2 \log \log n}} = 1$ w. p. 1.

REMARKS. (i) If $D_2 \lambda_F(T_o, S_o) = 0$, then (4.97) is unnecessary.

(ii) We could weaken the convergences in (4.97)-(4.100) and conclude that (4.101) is a weak stochastic quasi-differential for $\{T_n\}$.

(iii) The $\psi$ functions given by (4.4) and (4.5) satisfy (4.96). □

4.10.  Comparisons  with  other  results. 

The  Introduction  of  ’'^-estimation  as  a formal  approach  is  due  to  Huber  (1964), 
who  defines  to  be  any  representative  taken  from  the  set  of  solutions  of  the 
equation  (4.2).  For  the  case  that  li)  is  nondecreasinp,  he  establishes  stronp  con- 
sistency and  asymptotic  normality  of  T^.  A parallel  result  is  piven  by  our 
Theorem  4.5  with  the  option  (4.54a).  Our  conditions  on  ^ are  sliphtlv  more 
stringent,  but  we  provide-'  in  addition  the  law  of  the  iterated  logarithm  (LIL). 
Huber  rives  another  theorem  which  establishes  as’pnptotic  normality  of  in  the 
case  that  h.as  a uniformly  continuous  derivative  and  under  the  assumption  that 

consistency  of  T has  already  been  shown  bv  some  method.  Our  Theorem  4.6  is 
n 

comparable  to  this  result.  Hot  only  is  our  condition  on  i!/  milder  in  nature,  but 

also  we  obtain  the  LIL  in  addition  to  the  asymntotic  normality. 

Of  course,  the  main  objective  of  our  investigation  into  '^-functionals  has 

been  to  obtain  useful  new  results  not  for  "Hubers"  but  ^or  the  "redescenders" 

Introduced  by  Hampel  (1968),  (1974).  These  statistics  are  not  adequately  handled 

by  Huber's  treatment.  Nor  does  Hampel's  treatment  take  up  questions  of  asymptotic 

normality  and  almost  sure  behavior.  However,  Collins  (1976)  establishes 

asymptotic  normality  of  T for  the  case  that  iji  is  continuous  with  continuous 

n 

derivative ψ' and skew-symmetric and vanishes outside an interval [-c, c], and
F is governed by the standard normal density on an interval [T_0 - d, T_0 + d],
d > c. (Outside this interval, F is allowed to be arbitrary.) Collins takes T_n
to be the solution of (4.2) obtained by Newton's method starting with the
sample median. The almost sure behavior of T_n is not treated. Our Theorem 4.5
with the option (4.54b) provides a parallel result. The restrictions on ψ are
of roughly the same strength (although we do not require ψ' to exist
everywhere), but we greatly relax the requirements on F and we characterize the
almost sure behavior of T(F_n). Collins also extends his asymptotic normality
result to the scale unknown case. The corollary to Theorem 4.10 provides a
parallel result, though here we essentially require ψ' to exist everywhere.
Again, we also provide the almost sure behavior of T(F_n).
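
The Newton iteration from the sample median described above can be sketched
as follows. This is only an illustration under stated assumptions, not
Collins's actual procedure: it assumes (4.2) is the sample equation
sum_i ψ(X_i - T) = 0, and it uses Tukey's biweight as one example of a ψ that is
skew-symmetric, has a continuous derivative, and vanishes outside [-c, c]. The
names psi, dpsi and newton_m_estimate, and the default c = 4, are hypothetical.

```python
from statistics import median


def psi(x, c=4.0):
    """Tukey biweight psi (an assumed example of a redescender)."""
    if abs(x) >= c:
        return 0.0  # vanishes outside [-c, c]
    u = x / c
    return x * (1.0 - u * u) ** 2


def dpsi(x, c=4.0):
    """Derivative of the biweight psi; continuous, zero at |x| = c."""
    if abs(x) >= c:
        return 0.0
    u2 = (x / c) ** 2
    return (1.0 - u2) * (1.0 - 5.0 * u2)


def newton_m_estimate(xs, c=4.0, tol=1e-12, max_iter=50):
    """Newton's method for sum(psi(x - t)) = 0, started at the sample median."""
    t = median(xs)
    for _ in range(max_iter):
        num = sum(psi(x - t, c) for x in xs)
        den = sum(dpsi(x - t, c) for x in xs)
        if den == 0.0:
            raise ZeroDivisionError("sum of psi' vanished; Newton step undefined")
        # d/dt of sum(psi(x - t)) is -sum(dpsi(x - t)), so the step is +num/den.
        step = num / den
        t += step
        if abs(step) < tol:
            break
    return t


# Illustrative data (made up): five values near zero and one gross outlier.
sample = [0.1, -0.2, 0.3, 0.0, -0.1, 50.0]
estimate = newton_m_estimate(sample)
```

Because ψ vanishes outside [-c, c], the observation at 50 contributes nothing
to either sum, so the iteration settles near the center of the remaining five
points. Stopping after the first pass through the loop would give a one-step
estimator started at the median.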

An investigation of Carroll (1975), (1977) treats Bahadur-type (see
Bahadur (1966)) almost sure asymptotic representations for M-estimators. He
requires that ψ be bounded and uniformly Lipschitz of order 1 and possess two
continuous bounded derivatives piecewise on intervals, and that F be Lipschitz
in neighborhoods of the endpoints of these intervals. He further requires that
T_n be strongly consistent for T_0 (which implicitly entails further conditions
on ψ and F). The desired representation, once established, is quite fruitful,
yielding asymptotic normality and the LIL as by-products.

Bickel (1975) treats one-step M-estimators in the general linear model.
One of his alternative conditions (E^^, p. 432) requires ψ to be uniformly
continuous. He provides asymptotic normality but does not consider the almost
sure behavior of T(F_n).

Portnoy (1977) establishes asymptotic normality of T_n in the case that ψ
is bounded and has a derivative which is bounded and uniformly continuous except
on a Lebesgue-null set and that F is continuous and symmetric and has a density
satisfying certain regularity properties. The estimator T_n is taken to be the
solution of (4.2) nearest to a given consistent estimator θ_n. The almost sure
behavior of T_n is not treated. However, Portnoy does allow m-dependence rather
than strict independence.

Beran (1977a) establishes Hellinger metric differentiability of M-functionals
in the case that ψ is strictly monotone and bounded, lim_{x→∞} ψ(x) > 0,
lim_{x→-∞} ψ(x) < 0, and ψ has a continuous bounded derivative.
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